“Do-able” Questions, Covariation and Graphical Representation: 
Do We Adequately Prepare Preservice 
Science Teachers to Teach Inquiry? 

ABSTRACT 

The interpretation of data and construction and interpretation of graphs are central practices 
in science which, according to recent reform documents, science and mathematics teachers are 
expected to foster in their classrooms. However, are (preservice) science teachers prepared to 
teach inquiry with the purpose of transforming and analyzing data, and interpreting graphical 
representations? That is, are preservice science teachers prepared to teach data analysis and 
graph interpretation practices which scientists use by default in their everyday work? The present 
study was designed to answer these and related questions. We investigated the responses of 
preservice elementary and secondary science teachers, practicing science teachers, and scientists 
to data and graph interpretation tasks. Our investigation shows that despite considerable 
preparation, and for many, despite B.Sc. degrees, preservice and practicing teachers do not enact 
the (“authentic”) practices that scientists routinely do when asked to interpret data or graphs. 
Detailed analyses of written or videotaped answers on the tasks are provided. We conclude that 
traditional schooling emphasizes particular beliefs in the mathematical nature of the universe that 
make it difficult for many individuals to deal with data possessing the random variation found in 
measurements of natural phenomena. 

If scientists were looking at nature, at economies, at stars, at organs, they would not see 
anything. . . . Scientists start seeing something once they stop looking at nature and look 
exclusively at prints and flat inscriptions. . . all laboratory observers ha[ve] been struck 
by the extraordinary obsession of scientists with papers, prints, diagrams, archives, 
abstracts, and curves on paper. (Latour, 1990, p. 39) 

Ethnographic research in scientific laboratories and scientific field work has shown that 
designing investigations, collecting data, transforming data, and interpreting the resulting 
representations are some of the quintessential scientific practices (Latour, 1993; Roth & Bowen, 
1998). Recent reform documents have increasingly called for such “authentic” practices in 
mathematics and science education which would allow students to engage in these subjects in 
ways that correspond to everyday practices in these fields (AAAS, 1993; NCTM, 1989; NRC, 
1994). For example, mathematics curricula in Grades 5-8 should enable students to (NCTM, 
1989): 

• describe and represent relationships with tables, graphs, and rules; (p. 98) 

• analyze functional relationships to explain how a change in one quantity results in a 
change in another; (p. 98) 

• systematically collect, organize, and describe data; (p. 105) 

• estimate, make, and use measurements to describe and compare phenomena; (p. 116) 

• construct, read, and interpret tables, charts, and graphs; (p. 105) 

• make inferences and convincing arguments that are based on data analysis; (p. 105) 

• evaluate arguments that are based on data analysis; (p. 105) 

• represent situations and number patterns with tables, graphs, verbal rules, and equations 
and explore the interrelationships of these representations; (p. 102) and 

• analyze tables and graphs to identify properties and relationships. (p. 102) 

These competencies mirror the daily practices of scientists with their focus on data collection, 
analysis, and presentation and are thus easily integrated with science curriculum reform at the 
same grade levels. In fact, the integration of mathematics and science school activities may not 
only be interesting because children collect their own data, but may be essential for developing a 
thick layer of experiential knowledge that underlies much of scientists’ understandings (Roth, 
1996; Roth & Bowen, in press; Roth, Masciotra, & Bowen, 1998). Such integration of rich 
experiences with physical phenomena and subsequent transformation and analysis of the data 
appears to lead to robust mathematical and scientific understandings of phenomena (Greeno, 
1988; Roth & McGinn, 1998). 

To date, many science and mathematics teachers have not yet realized the potential that lies 
in situating mathematics in students’ self-directed inquiries about natural environments as a way 
to implement these NCTM standards. Moreover, there is some evidence that science teachers may 
not enact competent data interpretation themselves (Roth, McGinn, & Bowen, 1998), making it 
difficult for them to scaffold students in these practices. The present study is therefore 
fundamentally concerned with the question, “Are (preservice) science teachers prepared to teach 
through open-ended inquiry?” Specifically, we were interested in answering questions such as 
“How do (preservice) science teachers analyze a given set of data previously collected and 
presented by Grade 8 students?,” “How do (preservice) science teachers interpret a graph from 
published research?,” and “How do preservice science teachers analyze and interpret data which 
they themselves collected and transformed?” Furthermore, we were interested in understanding 
the (preservice) teachers’ performance relative to scientists analyzing the same data and 
interpreting the same graphs which were presented to them. 

Inscriptions: A Social Practice Approach to Representations 

Our theoretical approach for studying science in schools, university, and professional 
practice is informed by the emergence of anthropological, ethnomethodological, and sociological 
studies of scientists at work (Latour & Woolgar, 1986; Lynch, 1985; Traweek, 1988). All of 
these studies take a common perspective of science as a set of practices that are shared by 
members of specific communities — which is in contrast to more traditional work on science that 
saw in scientists a special breed of people who use special skills and procedures to cull facts 
from nature. Thus, these studies of scientists at work view knowledge not as something residing 
exclusively in the heads of community members but rather as something constituted, to a large 
extent, by the ways people (e.g., scientists) go about their daily business, how they justify what 
they do, the stories they tell, and so on. 

Inscriptions are two-dimensional representations of data that can then be transformed into 
other inscriptions; ultimately, they are included as tables or graphs in scientific publications. 
Inscriptions are therefore the result of scientists’ work which converts research experiences into 
a form that is easily shown to others. Using inscriptions, natural scientists have converted 
information about trees, moving lizards, soil, and screaming rats into representations which they 
can then use to help form the rhetorical basis of their claims (Latour, 1993; Lynch, 1990; Roth & 
Bowen, 1998). Inscriptions are central to the practice of science because they can easily be 
cleaned, transformed, superposed and labeled such that they can be incorporated as an 
evidentiary base into scientists’ conceptual arguments. As part of scientists’ argument 
construction, physical phenomena are moved through series of inscriptions that may include, in 
increasing order of complexity, such re-representations as maps, lists, tables, totals, means, 
graphs, and equations. Through these transformative processes and the resulting inscriptions, 
scientists both construct and see phenomena; without inscriptions there would be few scientific 
phenomena. Thus, using data sets to produce inscriptions which can be used in publications is a 
core scientific practice (Latour, 1987), one that graduates of a science program could be 
expected to use routinely when structuring their arguments in a scientific 
investigation. This expectation is not unreasonable given that this degree of competency has been 
documented with younger students conducting independent inquiry projects; one of our own 
previous studies documented the extraordinary competencies of Grade 8 students in constructing 
and transforming inscriptions when they conduct their own field-based research (Roth, 1996). 

Students of all ages have difficulty using graphs and other types of representations 
appropriately (Leinhardt et al., 1990; Schnotz, 1993). In related studies we 
have detailed the difficulties which second-year university science students had while 
interpreting graphical representations in seminar discussions (e.g., Bowen, Roth, & McGinn, in 
press). The foundations of the students’ interpretive difficulties in seminar sessions were shown 
in a microanalysis of the text and gestures accompanying the presentation of graphs in the 
lectures for that course. This analysis suggested that the interpretive framework of the lecturer 
differed from that of the students, and that this difference derived from different experiences in 
collecting and summarizing data. It also suggested that the lecturer’s gestures over the graph were 
those of someone who “knew” the graph, unlike those that would be made by someone who did not 
“know” the graph (Bowen & Roth, 1998a). Together, these differences lay at the root of the student 
difficulties observed in their seminar. 



COVARIATION 

Scatterplots, best-fit functions, and other graphs in Cartesian coordinates are ideal for 
representing the continuous covariation of two variables which would be difficult to express in 
words. Because of its typological character, language is well suited to expressing differences and 
categorical distinctions. On the other hand, graphs have a topological character well suited to 
expressing quantity, gradation, continuous change, continuous covariation, varying 
proportionality, and other complex topological relations of relative nearness and connectedness 
(Lemke, 1998). Graphs are sign forms which can therefore be used within particular 
communities to represent the topological and dynamic character of relationships. An analysis of 
scientific research articles from 5 journals covering over 2,500 pages showed that graphs which 
display the relationship of two or three variables are the preferred method of representation in 
science (Roth, McGinn, & Bowen, 1997). Sociological analyses have shown that graphs are 
predominant because, in the practices of scientists, they have the greatest rhetorical power 
(Latour, 1987). Although tables could also be used to show how measures of one quantity vary 
with those of another, the relationship across the entire data set is only 
implicit in tables, whereas graphs make the association immediately available in visual form 
(Bastide, 1990), allowing readers to note patterns in the data as well as discrepancies (e.g., 
outliers). 

RESEARCH DESIGN 
Tasks 

This study was designed to understand (preservice) teachers’ graph interpretation practices 
relative to the representations and transformations which they are expected to teach according to 
the reform document guidelines. We investigated these practices in three conditions. First, 
participants were asked to interpret a set of raw data presented on a map of the research site (Lost 
Field Notebook); the data did not easily reveal a relationship given the scatter and one potential 
outlier. Second, we asked participants to interpret a graph originally published in the scientific 
literature and which later, with modifications, appeared in textbooks and in an ecology lecture 
(Plant Distributions). Third, we presented participants with a task where they had to design their 
own investigations, collect data, transform data, and interpret the transformed data 
(Investigations). 

These three tasks represent three different levels of “authenticity” as they would be 
experienced by students in post-secondary science programs. The Lost Field Notebook problem 
represents a “school-like” task such as students encounter in problem sets from university 
science seminars and lectures (Bowen & Roth, unpublished data). The Plant Distributions task 
asks students to provide an interpretation of a graph which is similar to those they would 
encounter when reading a journal article, a common task that senior-level science students 
carry out in support of the research activities on which they report. Finally, the 
Investigations task reflects scientific practice in that it contains the components of most scientific 
research: framing a question, operationalizing variables, analyzing data, and making claims from 
that data. Together, these three tasks represent the main components of undergraduate science 
education which would deal with developing competency in conducting scientific research. 

The three tasks also differ in terms of the translation processes required for making claims 
about the relationships between the relevant quantities (Janvier, 1987; Roth, Tobin, & Shaw, 
1997). The Lost Field Notebook task requires a double transformation: first, the relationship between 
the measures has to be uncovered (e.g., using a graph, curve fitting procedure, statistical analysis, 
etc.) before the relationship can be translated into a verbal description of the situation that may 
have led to the particular data at hand. The Plant Distributions task requires one translation, for the 
relationship is visually available. Finally, the Investigations task requires a complete cycle of 
activities, from a situation description that identifies the variable categories, through measurement 
and representation, before another verbal description of the covariation can be related back to the 
situation. These translations are expressed in Figure 1. Unlike Janvier (1987), however, we 
pursue the translations not as psychological processes, but as social practices that are embedded 
in other social practices, and that are appropriated by individuals as they increasingly participate 
in communities where these practices are what everybody else is doing (Roth, 1996; Roth & 
McGinn, 1998). 



[Insert Figure 1 about here] 



Lost Field Notebook 

The Lost Field Notebook task originated in an earlier study (Roth & Bowen, 1995) where it 
was designed to test a research hypothesis about practices of data interpretation among Grade 8 
students engaged in a 10-week field study of different ecozones. The representation of the data in 
the map is a facsimile from the notebook of one Grade 8 student involved in the study. We wrote 
a stem that situated the data in the same context in which the children worked at the time in order 
to assess their data transformation practices using a problem that was as ecologically valid as 
possible. For the purposes of the current study, we selected one of the forms containing 8 plots 
and therefore 8 pairs of numbers (Figure 2.a). The graphical representation of the data in a 
Cartesian graph shows the ambiguity of the relationship (Figure 2.b). We chose this particular 
problem for at least three reasons. First, its apparent correspondence to a plausible experience 
seemed strong (even our scientists never questioned the authenticity of the data). Second, the 
problem is equivocal even for individuals with much more experience in research (e.g., graduate 
students). As Figure 2.b shows, the correlation changes from a nonsignificant to a significant 
relationship when Point C is considered an outlier and dropped from the analysis. This change in 
significance promised cognitive conflict (and discussion between pairs of participants 
working on the task) and, for us, an opportunity to study sense-making over and about those 
representations that participants constructed to support their arguments. Third, the problem was 
interesting because it is quite similar (in the scatter of the data) to scientific data sets as they 
emerge from ecological field work (Roth & Bowen, 1998) and in ecology research journals 
(Roth, McGinn, & Bowen, 1997). 



[Insert Figure 2 about here] 



Plant Distributions 

The Lost Field Notebook problem required some form of transformation before any 
conclusions about the natural environment could be made. Problems in the interpretation 
potentially arise even when the covariation is already represented in graphical form. We 
therefore chose a second task with a similar underlying variation (i.e., plant density as a function 
of a physical variable). To ensure closeness to scientific practice, we chose a graph from the 
ecological literature (Eickmeier, 1978), but modified it in ways similar to those used as lecture 
material in a university lecture course (i.e., clarified captions, reduction of variation in major 
trend-line patterns; Bowen & Roth, 1998a). The original research by Eickmeier was conducted to 
show that, consistent with a theoretical model about adaptation and niche exploitation, different 
photosynthetic mechanisms allowed plants to best thrive in different climatic conditions. C3 
(Figure 3) is the simplest, but most water consuming mechanism based on a one-cycle chemical 
process. The C4 mechanism conserves water by adding another cycle of chemical processes. The 
CAM mechanism is similar to C4, but the second cycle occurs in separate cells, so that gas 
exchange associated with the first process can occur at night; this process is separated from the 
second one which occurs during the day thereby minimizing water loss through the pores. 



[Insert Figure 3 about here] 



Modification of the inscription occurred in two ways. First, several local minima in the 
functions were eliminated to make for more continuous curves. Second, the temperature and 
moisture gradients were plotted above the graphical display. We used a caption similar to those 
found in the scientific literature, and added a reference to the literature so that respondents could 
see that the graph had come from the scientific literature. In this way, the graph was not unlike 
those several hundred identified and analyzed during a previous study of five ecology journals 
(Roth, McGinn, & Bowen, 1997). We asked participants to describe how they interpreted 
the graph and to tell us what they understood it to represent. 

“Authentic” Investigations 

Responding to tasks provided in the form of the previous two problems, though considerable 
context had been provided, can be criticized as too school-like in that the data and 
representations are preframed (Lave, 1992; Roth, 1996). We therefore asked one subset of our 
participants to design and conduct an investigation in which correlations between biotic and 
abiotic features of the environment were to be studied. They were told that the investigation 
should be framed in the form of two focus questions and include relationships based on some 
form of quantitatively measured variables. The students were to report their results using a 
scaffolding device, the Epistemological Vee (Novak & Gowin, 1984), to which they had been 
introduced previously. This device explicitly prompts users to state research questions, provide a 
brief description of their research method, report data, transform the data, and state claims based 
on the data. Because users are required to state their prior knowledge also, they can, after the 
fact, assess their learning in the process of the inquiry. We asked two of our scientist participants 
to comment on selected case analyses of reports written on these investigations. 

Research Participants 
Preservice Elementary Science Teachers 

These participants were enrolled in a Western Canadian university in their last year of a five- 
year elementary education program and had chosen science and mathematics as their subject 
matter specialty. They had taken a number of related courses, beyond the minimum required, in 
order to receive their specialist degree. The 10 preservice teachers (7 female, 3 male) constituted 
the entire class of an advanced science curriculum course, the only one offered during that school 
year. Nine of these preservice teachers had above-average GPAs in the elementary education 
program. (All pseudonyms start with the letter E to indicate students in the elementary education 
program.) 

Preservice Secondary Science Teachers 

These participants were enrolled in a secondary science teacher preparation program in a 
different university in Western Canada which accepts applicants only after they have 
completed a bachelor’s degree. All 25 students (10 male, 15 female) had previously completed 
undergraduate degrees with a major either in science (22 students), mathematics (2 students), or 
in the arts (1 student). Four students had obtained post-graduate degrees (in veterinary medicine, 
mechanical engineering, chemistry, and law, respectively); they also had work experience in their respective 
domains. (All pseudonyms start with the letter T for teacher.) 

Practicing Science Instructors 

Four science teachers (2 high school teachers, 2 university instructors), all with B.Sc. 
degrees (1 ecology, 2 biochemistry, 1 physics), participated in the study. Three had participated in 
research as assistants, but none had conducted independent research for the purposes of 
publishing the results of their studies. Their experience ranged from first year to more than 20 
years of teaching. (All pseudonyms start with the letter I for instructor.) 

Practicing Scientists 

Over the past two years, we have asked 25 practicing scientists to interpret various scientific 
representations, including different sets of data and graphs. All sessions were videotaped and 
transcribed. For the present purposes, we included 15 individuals: 10 who responded 
to the Lost Field Notebook problem and 10 who responded to the Plant Distribution 
Graph (with 5 individuals doing both). The individuals had a minimum of five years of research 
experience and at least an M.Sc. degree. The domains of their work differed widely, including 
ecology, entomology, marine biology, physics, chemistry, and forest engineering. (All 
pseudonyms start with the letter S for scientist.) 

Task Distribution 

The participants contributed to our data base to different extents and in different formats and 
social settings. The distribution of think-aloud, group-session, and written task environments 
across the different participant groups is shown in Table 1. 



[Insert Table 1 about here] 



Data Sources and Interpretations 

The present study was developed from a data corpus that includes (a) videotapes of 
individuals (scientists, science teachers) and groups (preservice elementary science teachers) 
solving the Lost Field Notebook problem and interpreting the Plant Distribution Graph; and (b) 
written solutions by individuals (Lost Field Notebook) and groups (Authentic Investigation) 
from the preservice secondary teacher population. 

Our interpretations inscribe themselves within the larger context of studies on the 
interpretation of scientific representations from middle school to professional practice; our 
studies draw on semiotics of scientific texts (Bastide, 1990; Eco, 1984) and interaction analysis 
(Jordan & Henderson, 1995) as the major methodological frames. We analyzed the data 
individually (in part to later assess the robustness of our categorizations) and, later, in 
collaborative sessions. In daily meetings, we generated assertions and tested them individually 
and collectively in the remainder of the data base. The transcripts and videotapes were taken as 
occasions for construing the public work done in providing a solution; in the cases where we had 
videotapes of pairs, it was expected that, if there was any trouble during the interpretation, the 
participants would try to remedy the breakdown by talking to each other. Our transcripts were 
therefore protocols of individuals’ and groups’ efforts in making solutions to the tasks as they 
understood them accountable to the researchers or to each other. 

FINDINGS I: INTERPRETING RAW DATA 

Our overarching question was whether (preservice) teachers enacted the scientific and 
mathematical practices described by reform documents (NCTM, 1989) in appropriate situations. 
Specifically, our first question asked “How do (preservice) science teachers analyze a given set 
of data previously collected and presented by Grade 8 students?” To contextualize the answers 
by (preservice) teachers, we present scientists’ responses to the same task. 

Scientists’ Readings 

If you possibly plotted out the graph, then did a linear regression 
on it, you might see an R² value that actually makes sense. 

In the course of our inquiry, we asked ten active researchers working either at a university or 
in the public sector, and all of whom had M.Sc. or Ph.D. degrees, to examine and provide an 
interpretation for the Lost Field Notebook problem. Without exception, these participants ended 
up plotting the data, proposed regression analysis to test goodness of fit, discussed an outlying 
data point, and suggested the collection of additional data to increase the power of the statistical 
analysis. The scientists were unanimous that, to make a convincing claim, they had to plot the 
data and provide statistical indicators about the strength of the relationship. Providing a data plot 
and the statistical information would be their way of supporting the claims. Without exception, 
all practicing scientists indicated that there appeared to be a relationship which should be 
substantiated by statistics and collection of further corroborating data. 

Scanning the Map 

Three scientists, after reading the story plot, immediately, without scanning the data and 
without hesitation, suggested plotting the data and subsequent statistical analysis. The others 
engaged in a more lengthy process of scanning the map, making tentative claims, plotting the 
data, and then conducting their analysis followed by statement of claims. The difficulty in our 
analysis lay in assessing what occurred during the first few seconds of seeing the map, because 
few participants verbalized what they focused on. But a few did. In the first reading, things 
become salient, that is, the reading establishes a domain ontology. Some scientists noticed the 
irregular plots, but this aspect did not enter their interpretation at all. 

Scanning for extreme cases and data points at “opposite ends” of the data range was a 
common practice. Thus, some scientists began by seeking those areas with the lowest light 
intensity or bramble density, and then moved to identify those with the highest values on the 
same variables. 

I’m looking at, say just these three which were the lower ones in this corner [top right], 
750 to 500, and then looking at these three [D] [H] [E], 12, 15 are of the two highest 
levels and these [C] [F] are the two lowest levels. [Stu] 

As they scanned the map, scientists noted the potentially discrepant data. But rather than 
using these data for drawing conclusions, this noting was simply part of establishing the domain 
ontology, which also included other aspects such as the irregular size and boundary of plots, the 
absence of “edge effects,” the differences in the size of the plots, or the identification of those 
plots in which the extreme cases on either variable were located. 

Tentative Claim 

The first, tentative claims after scanning the map were not consistent. In equal numbers, the 
scientists initially suggested that there was and was not a pattern, that is, a relationship between 
the two variables light intensity and bramble density. A typical statement was: 

So, at first glance, it would seem that there is not much of a pattern or a relationship 
between foot-candles and percent coverage by brambles. [Steve] 

At this point, rather than using individual data points for or against their claims, scientists then 
proposed to plot the data. Some immediately proposed subsequent statistical analysis to find 
correlations, and then outlier analysis. 

If you possibly plotted it out, plotted out the graph, then did a linear regression on it, you 
might see an R² value that actually makes sense, that’s why I would plot this data if I 
was, wanted to see a pattern. So just looking at it like this, doing a linear regression 
plotting percent cover versus light intensity, see if there is a line there, then calculate R² 
and if we did that we probably would see some kind of a pattern with increasing cover 
and increasing density, so, more light equals higher density of brambles. [Stu] 
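
The procedure Stu describes (fit a straight line to percent cover versus light intensity, then calculate R²) can be sketched as follows. This is a minimal illustration, not the scientists' actual analysis: the function name `linear_fit` is hypothetical, and the eight data pairs are invented for the example and are not the values shown in Figure 2.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a straight-line fit, R^2 equals the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# HYPOTHETICAL data for plots A-H: light intensity (foot-candles) and
# bramble cover (%). Illustrative values only, NOT the task data.
light = [1000, 1500, 500, 3000, 2500, 600, 1200, 2000]
cover = [8, 12, 30, 20, 14, 0, 5, 15]

slope, intercept, r_squared = linear_fit(light, cover)
print(f"slope = {slope:.4f}, intercept = {intercept:.2f}, R^2 = {r_squared:.2f}")
```

The sketch assumes that neither variable is constant; otherwise the divisions by `sxx` or `syy` fail.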

Data Plot and Analysis 

Scientists plotted the data, and with one exception, used light intensity on the abscissa and 
bramble density on the ordinate. After the data were plotted (as in Figure 2.b), scientists were 
unanimous about the (weak) relationship between the two variables. They then engaged in an 
analysis of discrepant data. For example, after having suggested that there “vaguely was one” 
relationship, Sally assessed the effects of possible outliers. 

Take that out [C], take that [D] out. Just to remove outliers, so if you remove an outlier to 
see if there is, if it’s a single point that sort of driving the whole relationship. So if you 
take that [C] one out, it’s not bad. But like this one [D] up here, if you take that out, I’d 
say . . . you’re grasping at a relationship. And if you take that [H] one out, it doesn’t 
change it too much. I would go to this point [F] is that 500, 0, no, this one [C] is 500 foot 
candles, 30%, see that one [C] looks a bit suspicious because there is so much variation 
between those two. [Sally] 

After this analysis, Sally concluded that there was a “positive relationship between foot-candles, 
or the amount of light they get and how many brambles there are.” 

There was only one scientist who proposed a curvilinear relationship. In contrast with the 
other scientists, he plotted light intensity over bramble density which provided him with a 
different perspective. He drew a best fit curve which was parabolic and then explained (Figure 
4): 

The only pattern I might see is that pattern GESTURES[parabola], somewhat like that, 
but not a super strong one. That suggests that there is some intermediate light level at 
which bramble coverage is greater. So I might claim that brambles have a optimum light 
level intensity in which they grow and reproduce optimally at, and the higher or lower 
light levels, their growth and reproduction is decreased. [Steve] 



[Insert Figure 4] 



In dealing with the outliers, scientists suggested the collection of additional data, checking 
whether there were copying errors from a notebook, or seeing if there’s “something weird about 
that region that results in either high ones, that resulted in a really high percentage with such a 
low.” One scientist proposed running consecutive regression analyses: 

There are statistical tests that can be used, curve fitting. The simplest one is straight line 
relationship, the R² statistics tells you how well the best fit straight line through a series 
of data points fits and now you can run that leaving certain data points out or leaving all 
the points out sequentially and seeing which one gives you the best R² or the best fit. 
[Stu] 
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Another scientist proposed the use of statistical indicators such as Mahalanobis and Cook’s 
distances which can assist in deciding whether an outlier significantly affects the relationship, 
and whether or not a data point could be dropped. (Many statistical packages have this option.) 
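
The two outlier checks mentioned by the scientists can be sketched in a few lines. The following is a minimal illustration rather than the scientists' actual analysis: it uses only the six (light, coverage) pairs quoted in this section (plots B and G are omitted because their values are not stated here), and the names `PLOTS`, `fit_line`, `leave_one_out_r2`, and `cooks_distances` are hypothetical.

```python
from statistics import mean

# (light in foot-candles, bramble coverage in percent) for the six plots whose
# values are quoted in this section. Plots B and G are left out because their
# values are not stated here, so the results are illustrative only.
PLOTS = {
    "A": (1000, 10),
    "C": (500, 30),
    "D": (1200, 40),
    "E": (1500, 30),
    "F": (500, 0),
    "H": (1250, 15),
}


def fit_line(points):
    """Ordinary least-squares line y = a + b*x; returns (a, b, r_squared)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b, sxy ** 2 / (sxx * syy)


def leave_one_out_r2(plots):
    """R^2 of the best-fit line when each plot in turn is left out."""
    result = {}
    for name in plots:
        rest = [pt for key, pt in plots.items() if key != name]
        result[name] = fit_line(rest)[2]
    return result


def cooks_distances(plots):
    """Cook's distance of each plot for the straight-line fit (p = 2 parameters)."""
    points = list(plots.values())
    n, p = len(points), 2
    a, b, _ = fit_line(points)
    x_bar = mean(x for x, _ in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    residuals = [y - (a + b * x) for x, y in points]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance
    distances = {}
    for (name, (x, _)), e in zip(plots.items(), residuals):
        h = 1 / n + (x - x_bar) ** 2 / sxx  # leverage of this plot
        distances[name] = (e ** 2 / (p * s2)) * h / (1 - h) ** 2
    return distances


if __name__ == "__main__":
    print("R^2, all six plots:", round(fit_line(list(PLOTS.values()))[2], 3))
    for name, r2 in leave_one_out_r2(PLOTS).items():
        print(f"R^2 without {name}: {r2:.3f}")
    for name, d in cooks_distances(PLOTS).items():
        print(f"Cook's distance {name}: {d:.3f}")
```

A plot whose removal raises R² markedly, or whose Cook's distance is large relative to the others, is a candidate for the kind of closer inspection the scientists described; whether it may be dropped remains a judgment about the data, not an output of the calculation.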

Other Factors 

Our prior research suggested that many non-scientists seek to explain the variation in both 
variables by drawing on explanatory resources outside of the written problem itself. That is, 
drawing on personal experience, they invoked other variables that might explain the particular 
data set in front of them. This was corroborated in the present study among the non-scientist 
individuals. On the other hand, scientists were only marginally concerned with other possible 
factors. Usually these concerns became evident before they actually plotted the data. For 
example, one scientist, after the first scan of the map, suggested that two plots [C,F] had 
particularly low light intensities which he thought were possibly due to shading by other trees. 
Another individual suggested that a water source at the western edge might be a mediating 
factor. 

Suggestions for Improving Elizabeth’s Study 

Scientists were almost unanimous about the fact that the number of data points should be 
increased, though at least one suggested that she herself had conducted and reported research 
based on 12 data points. Another common suggestion was to try and work with plots of equal 
size, though the scientists also realized that density was a relative measure and light intensity had 
been averaged across the plots. One scientist also suggested that it might be better to work with 
the absolute numbers of brambles in areas of normed size, but was uncertain whether this would 
improve the quality of the measure. 

You could actually calculate the absolute amount of brambles, which might be a better 
measure. I mean, ideally it might be better trying to layout defined, like areas of equal 
size. [Sally] 

Think Aloud Protocols by Instructors 

There has to be another variable involved in what’s happening here 
because a direct correlation between light intensity and percent 
density of the bramble doesn’t seem to hold true. 

Four instructors (2 university, 2 high school) with B.Sc. degrees were asked to think aloud as 
they completed the Lost Field Notebook problem. All four, without exception, inspected the data 
and, without any transformation, claimed that there was no relationship between light intensity 
and bramble density. 

I mean it seems, you know, the higher [D], the higher light higher coverage, but then 
when you look at like between 200 [D] and [E], between 1200 [D] and 1500 [E] it looks 
like that but then when you look at this one [H], well, that’s not very high, so why not? 
like [D] [E] it doesn’t [H]. [Ina] 

Tentative Hypothesis and Testing 

Three of the four instructors engaged in cycles of explicitly verbalizing at least one tentative 
hypothesis, and then rejecting this hypothesis based on an analysis of individual data points. 
“High percent, lots of light [D], low percent, lower light [G] higher light and higher percent [B]” 
(Ike); “it seems, you know, the higher [D], the higher light higher coverage” (Ina). 

One could argue that all brambles need light but then that’s defeated by the fact that 
we’ve got light 500 foot candles here [F] and no brambles at all. One could argue that 
brambles need more than 500 foot candles to grow but that’s [C] defeated by the fact that 
you’ve got 30% incidence with the self same 500 foot candles that over here [F] was not 
growing anything. [Ira] 



[Insert Table 2 about here] 

Three of the four used pairwise comparisons of data points (Table 2). In a few instances, 
three [D,C,E] and five areas [A,B,F,G,H] were clustered to obtain geographical patterns. In these 
instances, two types of comparisons were used: within-variable comparisons and 
between-variable comparisons, starting with either the light intensity or the density comparison. 
Usually, this pattern was used to show exceptionality, that is, for an equal or similar value in one 
variable there was a drastic difference in the measure of the other variable, such as in the 
comparison of [C] and [F], 500:500:::30:0. 

One person (Ira) used the trend within a pairwise comparison as a counter argument against 
an overall trend. Thus, whereas the light intensity increases going from Plot D to Plot E 
(1200:1500), the opposite trend is observable for the coverage (40:30). This was interpreted as 
indicating a negative relationship held against an overall positive correlation. 
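
How a single pair can run against an overall trend can be made concrete by classifying every pair of plots at once. The sketch below is our illustration, not part of the participants' or the authors' analysis: it uses only the six (light, coverage) pairs quoted in this section (plots B and G are omitted because their values are not stated here), and `classify_pairs` is a hypothetical helper name.

```python
from itertools import combinations

# (light in foot-candles, bramble coverage in percent); B and G are omitted
# because their values are not stated in this section.
PLOTS = {
    "A": (1000, 10),
    "C": (500, 30),
    "D": (1200, 40),
    "E": (1500, 30),
    "F": (500, 0),
    "H": (1250, 15),
}


def classify_pairs(plots):
    """Sort every pair of plots into concordant, discordant, or tied.

    A pair is concordant when light and coverage change in the same direction,
    discordant when they change in opposite directions, and tied when the two
    plots share the same light intensity or the same coverage.
    """
    groups = {"concordant": [], "discordant": [], "tied": []}
    for (name_1, (x1, y1)), (name_2, (x2, y2)) in combinations(plots.items(), 2):
        product = (x2 - x1) * (y2 - y1)
        if product > 0:
            key = "concordant"
        elif product < 0:
            key = "discordant"
        else:
            key = "tied"
        groups[key].append(name_1 + name_2)
    return groups


if __name__ == "__main__":
    for kind, pairs in classify_pairs(PLOTS).items():
        print(kind, len(pairs), pairs)
```

Under these assumptions, the D-E comparison falls into the discordant group and C-F into the tied group, while most of the remaining pairs are concordant. A discordant pair is therefore compatible with an overall positive trend in scattered data.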

Two individuals [Ira, Ian] crossed the arguments. For example, when comparing the areas [C] 
and [D], the argument ran 40:1200:::1250:15 (“40 is what you were seeing at here [D], 1200, 
while this [H: 1000] is down to 15% here [H: 15]”). 

Two individuals [Ian, Ina] considered three data points as they searched for consistency 
among data points. Ian compared the data set [H,A,D] in both a between 
[1250:15:::1000:10:::1200:40] and within [1250:1000:1200:::15:10:40] condition, concluding in 
both cases that [D] was a discrepant point with respect to coverage. His other three-point 
comparison consisted of the set [DCE] for which he used a within comparison 
[40:30:30:::1200:500:1500] to conclude that [C] showed a discrepancy with respect to light 
intensity. Ina tested the hypothesis “higher light, higher coverage” and then proposed the set 
[DEH] to reject it [1200:1500:1250:::40:30:15] because the coverage in [H] was low. 

Although each individual made a number of comparisons, when they were asked what they 
claimed and how they supported it, they generally used one example to contradict the 
relationship between light intensity and bramble density. 

Pattern Map: “There is Something through this Area” 

As they abandoned the search for a relationship between the two variables, individuals 
proposed a geographical pattern in which the western edge [D,C] and Plot [E] with high bramble 
densities were opposed to the low densities in the remaining areas. 

I have a hard time saying the more light, the more brambles ‘cause that’s not entirely 
true. It’s almost as if there is something down through this segment CIRCLES 
[G,B,F,H,A] of the land here that’s just decreasing the amount of brambles, and this [E], 
and this [D], or this [E] is an erratic, I’d be curious if something happened on this side 
POINTS [right map boundary] [Ian] 

One person suggested that Plot E may be an outlier to the general east-west (right-left) 
geographical pattern of bramble density. Thus, whereas we initially assumed pattern maps 
to be an independent strategy (e.g., Roth, 1996), the present data suggest that participants only 
engaged in this practice after exhausting other options and after suggesting that a covariation 
does not exist. 

Other Factors 

All four proposed that any weak relationship was spurious and that factors other than light 
determined the density of brambles. 

But the thing that she is actually measuring is the differences in soil quality, for example, 
or differences in water in the different areas. [Ina] 

Whereas the physics instructor (Ian) did not get into any specific alternative, the others proposed 
a variety of factors including water, soil quality, soil characteristics (such as a rocky outcrop 
underneath Plot F), and seed distribution. One person (Ina) also suggested that more data should 
be collected in order to make better-founded claims and to check whether the distribution of the 
plants within each plot is fairly homogeneous or whether the plants come in clusters. Another 
individual (Ira) thought that, because of the small size of the area covered by the map, there 
might possibly be considerable experimental error in the determination of the bramble density. 

Preservice Teachers’ Readings: Prior Work 

A pilot study (N = 17) and an initial survey (N = 32) (Roth, McGinn, & Bowen, 1998) based 
on written tests showed that only a small fraction of secondary preservice teachers (5 of 49), 
despite their prior B.Sc. and M.Sc. degrees (most of them in biology), used graphical and/or 
statistical analyses when responding to the Lost Field Notebook problem. Statistical comparisons 
revealed that there was a significantly higher proportion of Grade 8 students (who solved the 
problem in pairs) who used graphical and statistical analysis methods than secondary preservice 
teachers. Having classified responses into more abstract representations (graph, averages), less 
abstract representations (ordered table, pattern map, list), and no transformation 
(language-based), we detected a significant effect (χ²(2) = 6.80, p < .05). There was a lower incidence of 
more abstract representations among preservice teachers than among pairs of Grade 8 students. 
Furthermore, there was a relationship between the type of analysis and the type of claim 
respondents made. A logit analysis, with type of claim (correlation, no correlation) as 
dependent variable and type of representation (more abstract, less abstract, none) as independent 
variable, showed that an equi-distribution model had to be rejected, L²(3) = 16.42, p < .001. 
Analyses by respondents based on statistical and graphical methods generally suggested a 
positive correlation between light intensity and bramble coverage, whereas analyses based on 
other methods generally ended in claims that there existed no relationship between the two 
variables. 
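
The significance levels reported in this paragraph can be checked by hand, because the upper-tail probability of the chi-square distribution has a closed form for 2 and 3 degrees of freedom. The sketch below is an illustrative check only; it assumes that the statistic reported for the logit analysis is referred to a chi-square distribution with 3 degrees of freedom, and the function names are hypothetical.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability P(X > x) for chi-square with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for chi-square with 3 degrees of freedom."""
    z = x / 2.0
    return math.erfc(math.sqrt(z)) + 2.0 * math.sqrt(z / math.pi) * math.exp(-z)


if __name__ == "__main__":
    # Statistics as reported in the text above.
    print("df = 2, statistic 6.80:", chi2_sf_df2(6.80))
    print("df = 3, statistic 16.42:", chi2_sf_df3(16.42))
```

Both tail probabilities fall below the thresholds stated in the text (.05 and .001, respectively), so the reported statistics and significance levels are mutually consistent.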

In the present study, we had two objectives. First, we wanted to collect verbal protocols of 
individuals and pairs to better understand the processes by means of which (preservice) teachers 
arrive at the particular claims and how they select the method for supporting their arguments. 
Second, we assumed that preservice teachers in the previous study, despite their scientific 
training (and B.Sc. degrees), did not use graphical (or statistical) analysis because they had not 
recently engaged in activities in which drawing graphs and doing statistics is “what is normally 
done” and “what everyone else does.” We expected that the frequency of graph use would 
increase if the participants were primed. We therefore repeated our earlier studies with preservice 
secondary science teachers but in a new condition: We primed participants immediately prior to 
the Lost Field Notebook with an activity that required them to answer the question, “How does 
the height from which you drop a ball affect the bounce?” by collecting and recording data, 
transforming the data into a Cartesian graph, and drawing conclusions from this graph. 
Specifically, they were asked to construct a scatter plot and to base their interpretation on this 
plot. Participants recorded the entire activity using the “epistemological vee” (Novak & Gowin, 
1984) as a scaffold which provided prompts for them to engage in particular steps, from question 
to design, data collection, data transformation, analysis, and statement of claims. However, we 
also expected (based on our 20 years of combined experience teaching science in middle and 
high schools) that, because they had little prior experience in data analysis, at least some 
participants would reject a relationship between the variables in the LFN problem because the 
data did not fall on a (straight or curvilinear) best fit graph. 

Individual Written Answers after Priming (Preservice Secondary Teachers) 

The missing field notebook exercise was very difficult for me. 

As in the previous study, and despite their science degrees and the priming, preservice 
secondary science teachers found the Lost Field Notebook activity difficult. One of the 
individuals who produced a data plot with a line of best fit suggested: 

It is very clear to me that I was taught science as a collection of facts, not as an 
exploration. This exercise was very difficult for me. I can see its usefulness already. I 
think it is important to have this kind of “thinker” exercises included in the curriculum. 

Another person suggested, “What was that Lost Field Notebook exercise all about? I couldn’t 
make any sense of it. Now I really feel like a non-science type” [Tandy]. 

Drawing on Latour (1987) and our own prior work, in this study we categorized answers 
along a continuum {No Transformation (verbal) → Ordered Table → Ratios, Data Plots, Data 
Plots + Bestfit}. Table 3 shows that, possibly as a result of the priming activity, a larger fraction 
of participants drew graphs (44%) than in our previous studies. However, there were only 7 
individuals (26%) who used lines of best fit (two with outlier analysis) in the way we observed 
the scientists use them. Furthermore, we found an almost clean break between the claims made 
by those participants who plotted data accompanied by best-fit and outlier analysis and all other 
solutions, including those that had only plotted the data: Those who claimed that there was a 
relationship in the complete set of the data had all used a line of best fit in their analysis. 
Generally, there was a much larger number of enumerations and discussion of other possible 
factors that determined bramble density among those responses that did not use plots and lines of 
best-fit and therefore claimed that there was no relationship between the two variables. There 
were only 7 cases where quantitative comparisons between two data points were made, 6 of 
which were related to a comparison of Plots C and F. 



[Insert Table 3 about here] 



Data Plots, Bestfit Lines, and Outliers 

Figure 5 shows one of the solutions in which the data are plotted, a line of best fit is drawn, and 
one data point is analyzed as an outlier. Four of the preservice secondary teachers also 
constructed a table which appeared prior to the graph on the answer sheet; two individuals 
initially suggested on the basis of the table that there was no relationship, one person (Ph.D. in 
mechanical engineering) disregarded the table in favor of the graph, and the fourth person had 
constructed a table in which the light intensities for same-value coverages were already averaged, 
permitting the conclusion of a positive correlation. 

[Insert Figure 5 about here] 



One individual, before plotting the data, prepared a data table ordered according to the 
percent coverage but for each value, averaged the associated light measurements. She then 
plotted the reduced number of data points (5), produced a line of best fit and concluded that there 
was a positive relationship between the “% brambles and # fc.” 
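
The data reduction this participant performed (ordering by percent coverage and averaging the light measurements that share a coverage value) can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of her table: only the six (light, coverage) pairs quoted in this section are used, plots B and G are omitted because their values are not stated here, and `average_light_by_coverage` is a hypothetical name.

```python
from collections import defaultdict
from statistics import mean

# (light in foot-candles, bramble coverage in percent); B and G are omitted
# because their values are not stated in this section.
PLOTS = {
    "A": (1000, 10),
    "C": (500, 30),
    "D": (1200, 40),
    "E": (1500, 30),
    "F": (500, 0),
    "H": (1250, 15),
}


def average_light_by_coverage(plots):
    """Return {coverage: mean light}, ordered by increasing coverage."""
    groups = defaultdict(list)
    for light, coverage in plots.values():
        groups[coverage].append(light)
    return {cov: mean(lights) for cov, lights in sorted(groups.items())}


if __name__ == "__main__":
    for coverage, light in average_light_by_coverage(PLOTS).items():
        print(f"{coverage:>3}% coverage: {light} fc (averaged)")
```

Averaging in this way removes the within-coverage variation (for instance, the two different light readings at 30% coverage collapse into one value), which is one reason a reduced plot can look cleaner than the raw data.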

The one person who concluded that there was no relationship despite having drawn a line of 
best fit initially began with an ordered table. She argued that the “variance from the line of best 
fit suggests an inconclusive relationship. . . supported by the fact that both 0% and 30% have a 
value of 500 foot candles” (Tora). In this, her argument was similar to those of the individuals 
using data plots only, without accompanying lines of best fit and outlier analysis. 

Data Plots Only 

When participants used data plots but without accompanying lines of best fit, the claim in all 
cases was that a relationship did not exist (Table 3). Of the five claims, three were supported by 
citing discrepant data points, the remaining two simply by referring to the scatter of the data that 
did not permit the attribution of a clear relationship. 

It would appear that an increase in foot-candles in and of itself does not consistently 
result in an increase in brambles. Rather, it would appear that the amount of outside 
(presumably unobstructed access to light) area is indicative of the increased brambles. 

For example, if you compare the outside unobstructed light of the one with the smallest 
amount [F], the density is 0% versus the one that is in the triangular area [C] which has a 
larger unobstructed area having a density of 30%. [Tabby] 

One student [Tanya] split the entire field into an upper and a lower area and produced plots for 
each set of four data points separately. For the plot containing the data of the upper four areas 
{D, G, B, F}, she claimed the existence of a relationship whereas in the case of the remaining 
data, she claimed that there was no relationship. An analysis of the two graphs shows that in the 
first instance (Figure 6.a), the data can be thought of as lying on a curve, whereas this is not the 
case for the second plot (Figure 6.b). This analysis further supports our claim that participants 
who have not engaged in science as a daily routine activity tend to assume that relationships 
between variables have to be ideal in the sense that data points fall on (curved or straight) lines. 
If this is not the case, as the differentiation between Figures 6.a and 6.b shows, a relationship is 
not defended, or participants argue that the relationships are mediated by some other variable. 

All of this suggests a deep-seated assumption and mundane sense — which has existed since the 
early Greek philosophers — that nature and mathematics are isomorphic, that is, that the world is 
fundamentally mathematical. Thus, if there is not a ‘clean’ relationship between two 
variables — if all data points do not fall onto a line — it is assumed some other variable is 
mediating the relationship or that a relationship just does not exist. 



[Insert Figure 6 about here] 



Ordered Table 

Five participants constructed tables of ordered values; all individuals ordered their tables on 
the basis of the approximate coverage; one individual also constructed a second table in which 
the data pairs were ordered according to the second variable. 

There is not an overriding correlation between the light and density of brambles. Areas 
with 1250 fc and 1200 fc respectively, have 15% and 40% density, respectively. 

Question: Could soil content or pollution, slope, drainage, have an equally strong effect 
on plant distribution? The proximity of the areas of low density would indicate a spill 
(killer) or type of adjacent soil that does not enhance growth. [Tammy] 

Four of the five individuals suggested that either the investigation needed to be re-done or that 
additional measurements on other variables possibly mediating the relationship were necessary 
(including nutrients present, soil type, moisture, moisture retention, animal predation, pollution, 
slope, and drainage). 

No Data Transformations (Verbal) 

Without a transformation of the data into some other mathematical form, it is difficult to 
make claims about the relationship between two variables under consideration, and therefore 
contribute to the construction of a phenomenon (which here would be that light fosters plant growth). 
In contrast to our earlier research which had shown that both preservice secondary science 
teachers and Grade 8 students split their claims with respect to the existence of a relationship 
(yes/no), in this study all respondents who did not transform the data claimed that a relationship 
did not exist between light intensity and bramble density. The following answer was provided by 
Tilson (honors B.Sc. in biology, environmental science): 

• It is difficult to draw conclusions on patterns from these field notes because she has 
broken the data up into small sections — so it is difficult to make conclusions. 

• I don’t perceive any patterns between % of bramble cover and the amount of light. 

• The use of % bramble cover is misleading because it is referring to different area 
sizes. 

• The highest bramble coverage seems to be along the left and bottom sides of the 
study area — perhaps this is an edge of some kind, or perhaps there is a path of 
bramble running along this edge. The light is strongest along this edge as well, except 
for along the weird slanted side — perhaps this is a building or wall which is blocking 
the light. 

In addition to the problems of perceiving patterns from the raw data on the map, this participant 
also argued, in contrast to general practice, that the relative coverage is a function of the plot 
size. However, it is not clear whether in this case the argument drove the claim or if the argument 
emerged after a pattern was not detected. Like many other individuals who claimed that there was 
no direct relation between the two variables, Tilson then sought patterns in the geographical 
distribution and hypothesized about possible natural features (i.e., other factors) that might 
cause such a distribution. 

Other solutions 

Two solutions did not fit into our previous scheme and were, because of their limited 
frequency, categorized as “other.” One individual re-drew the map to scale including three 
cross-sectional lines and beneath it, plotted the average bramble coverage against location. In this way, 
she engaged in the construction of “transects,” a common practice in ecological field work 
related to plant distributions (see next section). 

Collaborative Readings by Preservice Elementary Teachers 

We don’t know enough information to make many patterns. 

Among the preservice elementary teacher pairs we found similar claims and quantitative 
arguments as among the preservice teachers (secondary) with bachelor’s degrees. However, as is 
seen in Table 4, the number of quantitative comparisons was lower (in relative and absolute 
terms), and between (rather than within) strategies were predominantly used. As before, the 
comparisons between CF, DE, and DH made up the bulk of the numerical comparisons. One 
individual proposed the existence of two subareas, in each of which there was a different kind of 
relationship. 

Yeah, GESTURES [D ↔ B] like if it was to say that these are all correlated there’s 
some kind of connection with these at the top or there is something at the bottom, it just 
don’t go either, I mean, ‘cause 10% should be 750 whereas 40% is 1200. [Ema] 

[Insert Table 4 about here] 



In contrast to the instructors with B.Sc. degrees, the preservice elementary teachers made 
many comparisons in which one or both measures were compared on qualitative grounds. 

Because here’s 1250 [H] and 1200 [D], which are very similar, and there’s [H], that’s [D] 
like more than twice as much. This [C] is 0 [F] and 30 [C] right, so we can’t really see a 
pattern between the light and the percent of brambles. [Etta] 

You’re going up here [H,G] in this stretch. And you go from pretty much the same 
amount of coverage, pretty close, but there’s a huge amount of light difference. [Ella] 

In these cases, the comparisons are not based on ratios. Erin first compared the coverages of C 
and D, and arrived at the result “10% less.” She then compared this to the “half the amount of 
light.” In Etta’s case, the H and D areas are “very similar” in terms of light intensity, which is 
compared to the “more than twice as much” in coverage. The argument of similarity then carries 
over into the comparison of C and F (each 500 foot candles), but with drastically different 
coverage. Ella’s argument also rested on a comparison of similarity in one measure (they are 
both 30%), whereas on the other measure, “[E] got way more light than [C].” 

“Maybe It’s a Pathway or Parking Lot or Something” 

In four of the five groups, numerical and qualitative comparison of the measures 
predominantly occurred during the first half of each session. Thereafter, the task definition 
appeared to change from seeking a relationship between the variables to one in which students 
attempted to explain why Elizabeth might have obtained the particular measures she had. When 
problems get complex, problem solvers, whether copier repair people or economists, appear to use 
narratives (Bruner, 1986; Orr, 1990). The following excerpt is an attempt to explain the 
geographical distribution of the light and/or bramble coverage: 

I think there must be a pattern in here maybe with the source of light, because if we’ve 
got, for some reason, it seems to be going this way [E → D] and then when it breaks of 
that [west] way then [east] it’s less, you know what I mean, because we got 15 [E], a 
1000 [A], a 1250 [H] and a 1200 [D] and then when you go. [Erin] 

The groups generally focused on the geographical distribution of light intensity and bramble 
density. In four groups, participants elaborated on possible effects due to the movement of the 
sun, blockage of light by objects (hill, rocks, fence) or plants (trees, brambles in neighboring 
plot). One major concern in these groups was the lack of brambles in Plot F, for which varying 
reasons were proposed: a sidewalk, compost heap, cement patio, rock outcrop, parking lot, yard, 
pond, and rocky cliff or slope were all suggested as features that did not permit brambles to 
grow in this plot. Among the factors considered more generally which might mediate the growth 
were differences in soil quality and type, depth of soil, ground water, water received through 
rains or sprinkler systems, and competition by other plants (weeds or trees) crowding out the 
brambles. 

“How did they measure the foot candles of light?” Questions of Method 

Three respondents wondered about the areas and suggested that their shape was possibly 
determined by the percent coverage. For example, some wondered whether the shape of Plot D 
was determined by that area in which the average coverage was consistently 40% (Erin, Eli) or 
because the plots are “separated by the amount of light they receive” (Eva). Ema suggested that 
in her experience, sampling areas were either round or square, but never irregularly shaped. 
Others suggested possible features such as pathways (Ema, Ed) that might have determined the 
particular shape of the plots. In two groups, participants asked for reassurance that the maps were 
correct and whether these maps actually corresponded to the research area. There was also a 
question whether the light measurement was read appropriately from the instrument. 

Discussion of LFN Solutions 

This part of the study showed that whereas scientists all defaulted to the same practice 
(transformation of data into Cartesian plot, statistical analysis, outliers), the (preservice) teachers 
enacted these practices only when primed with a similar activity. Even then, only a minority 
(26%) engaged in best-fit or trend analysis. That is, at this point, most (preservice) teachers do 
not enact the default interpretive practices which we observed among the scientists. One evident 
difficulty affecting the teachers’ interpretations was that the data did not fall on a neat line but 
were scattered. Variation of one variable for the same or similar values of the other was used as 
evidence to argue that covariation did not exist in the data set. 

The most discussion of individual data points (quantitative and qualitative taken together) 
occurred among the pairs of preservice elementary teachers (5.2 per group); fewer among the 
instructors (4.0 per individual); and least among the written answers from the preservice 
secondary science teachers (0.4 per individual). An analysis of how the comparisons of the data 
points were deployed in the argument shows that the predominant number of these (31 
numerical, 9 qualitative) were used to argue that with same or similar measures on one variable, 
there was considerable variation on the other (Table 5). Fewer comparisons were used to argue 
either that there was a pattern of the type low light:low coverage:::high light:high coverage (7) or 
that there was an inverse relation, in which an increase in one variable was associated with a 
decrease in the other when two data points were compared (3). 



[Insert Table 5 about here] 



Arguments against a relationship between light intensity and bramble density based on the 
comparison of individual data pairs share similarities with the model-based reasoning employed 
by college students on algebra story problems (Hall, Kibler, Wenger, & Truxaw, 1989). Our 
participants in this category reasoned directly within the situation glossed by the problem rather 
than relying on mathematical formalisms. They proposed a relationship and then used specific 
instances in which the hypothesized pattern was violated, or used a specific instance as counter 
argument for a relationship. Table 5 indicates that, if qualitative and quantitative comparisons are 
considered together, there was a considerably larger number of within variable comparisons than 
between or cross variable comparisons (χ²(2) = 22.8, p < .0001). 

Our participants’ search for a firm association between variables is not something that 
should be attributed to some cognitive deficit, for there are long-standing traditions among 
scientists themselves whereby firm, ideal associations are thought to underlie worldly 
phenomena. Early astronomers, and particularly Ptolemy, added an increasing number of circles 
(epicycles) in order to maintain a model of the universe based on circles. Just as our participants 
introduced additional factors to try and clarify relationships, Ptolemaic astronomers added 
further epicycles to bring their models closer to the data points. Furthermore, a recent 
evaluation suggests that research on the effect of cholesterol was mired in long, never-closed 
controversies because scientists believed that close associations should exist, but no research 
could ever establish a clear relationship: 

The scientists conducting these studies were looking for the sort of diagnostic signal 
which was characteristic of pre-World War II medical success stories — that is, a certain 
blood cholesterol level that was as firmly associated with heart disease as was the 
tubercule bacillus with tuberculosis or high blood sugar with diabetes. (Garrety, 1998, pp. 
733-734) 

Thus, underlying the discourse of many of our participants is an epistemology that the world can 
be mathematized in a way that makes for perfect explanations of the data (granted that they are 
“good” data). However, we do not claim that people actually “hold” such beliefs. Rather, even 
people who have never thought about these relations, drawing on cultural resources to which 
they are consistently exposed (such as the media), and possibly because of “common sense,” will 
make claims that are consistent with such an epistemology. The claim that scientists believe in 
the isomorphism of nature and mathematics (Lynch, 1991) should be expanded to include at least 
those populations from which our participants originate — (future) teachers of science. But 
whereas scientists know from (research, laboratory) experience that data almost never fit ideal 
lines, our participants did not have such experiences. Thus, our explanation for the answers 
provided by our research participants focuses on differences in the habitual practices in 
which the different participants engaged rather than on differences in cognitive ability. This 
contention is further elaborated in the next section which shows that even practicing scientists 
may experience difficulties when it comes to interpreting line graphs that do not come from their 
own domain. 



FINDINGS II: INTERPRETING TRANSFORMED DATA 

Our second research question concerned the interpretation of data when these were already 
expressed in the form of a graph, that is, “How do (preservice) science teachers interpret a graph 
from published research?” The Plant Distribution graph (Figure 3) is one which originated in the 
scientific journal literature, but which is also found, in a transformed fashion, in undergraduate 
science textbooks and lectures (Bowen & Roth, 1998a). Thus, asking participants with science 
backgrounds to interpret such a graph is an “authentic” activity in that it is one in which they 
would normally engage, as part of their reading of scientific writings, as they learned about 
science. This particular graph is also conceptually consistent with the Lost Field Notebook task 
as both deal with a correlation of two measures. However, they are also different tasks in that in 
the Plant Distribution task the transformed representation is already complete and has been 
“cleaned” so that variation is minimized and the best-fit lines are generally consistent with the 
caption. By adding the caption, we constructed a task that had a higher degree of similarity to the 
ordinary activities of scientists than a graph without the caption would have had. Even those 
participants who had little familiarity with journals inferred that the scientific literature was the 
source of the graph. As one preservice elementary teacher suggested, “Maybe there is an article 
that goes along with it where it says something about the plants being important for something or 
other.” Furthermore, a pilot study suggested that without a caption this graph was virtually 
meaningless for all members of a group composed of graduate students of education and 
mathematics education professors. 

Semiotically-Informed Phenomenological Hermeneutic of Graphs 

In research on graphs and graphing from a sociocultural perspective, we have evolved a 
semiotically-informed phenomenological hermeneutic to frame, describe, and explain the 
process of interpretation (Roth, 1996; Roth & Bowen, in press; Roth, Masciotra, & Bowen, 
1998; Roth & McGinn, 1997, 1998). A phenomenological hermeneutic undertakes to rebuild, 
from the beginning, the conditions necessary for the understanding of graphs as cultural units 
which semiotics — and traditional cognitive science (e.g., Larkin & Simon, 1987; Tabachneck- 
Schijf, Leonardo, & Simon, 1997) — accepts as data because communication functions on the 
basis of them. Such an approach is necessary because our research showed that graphs, even for 
practicing scientists, are often highly ambiguous “things” that have to be constructed as a 
signifying object with particular features before or as part of constructing possible referents to 
which the sign refers. Phenomenology therefore refers perception back to a stage where signs are 
no longer confronted as explicit messages but as extremely ambiguous texts akin to aesthetic or 
biblical ones (Eco, 1976; Ricoeur, 1991). 



[Insert Figure 7 about here] 



We view the interpretation of graphs as a dual, not necessarily sequential process which (a) 
establishes the graph as a sign which (b) stands for some phenomenon in the world (its referent). 
In the first process, the graph as a sign to be constructed is an object in the world which itself has 
to be structured (Figure 7; top left). That is, the graph is a referent for the structuring process 
that establishes its nature as a sign and its specific features. The result of the second process is a 
phenomenon in the world that stands as a referent to the graph as sign (Figure 7; bottom right). 
Our work shows that during graph interpretation, the two processes are interwoven such that 
both graph as object and graph as sign are concurrently constructed in a cyclic and mutually 
constitutive fashion (Roth, 1998; Roth, Masciotra, & Bowen, 1998). Interpretants can be of 
different kinds: an equivalent sign vehicle in another semiotic system (drawing of a 
mountain), a synonym, a translation into another language (“Berg”), an emotive or metaphoric 
association (mountain = purity), a scientific or naive definition in the same semiotic system 
(mountain = natural elevation with steep sides), or an iconic representation of a mountain (e.g., 
Fuji). The work of the sign-interpretant relation is to elaborate the sign-referent relation. In this 
section we illustrate different levels of reading by (a) a professor who knows the type of research 
and the graph intimately, (b) other scientists who know the type of research, (c) (preservice) 
teachers who mainly construct the graph as a signifying object and engage in literal readings, and 
(d) two scientists who discarded the graph as meaningless. 

Readings by Scientists 

We can see the effect of these different types 
of metabolisms on distributions of plants 

Distribution graphs are relatively familiar to most scientists. Yet reading a graph is not a 
straightforward activity; it depends on the level of familiarity with the referents of axis 
labels and objects identified in the graph, that is, with the dimensions that span and constitute the 
new (virtual) space, and on familiarity with the research methods that lead to such graphs. 

For the ecology professor (Sen) familiar with the research from which the graph was taken, 
the graph was actually transparent such that he hardly referred to it at all, but talked about its 
“meaning,” that is, the ecological discourse into which it inscribes itself. 

We can see the effect of these different types of metabolisms on distributions of plants. 

Here we have a moisture and elevation gradient and a transect which is actually an 
elevation gradient but here elevation is closely associated with moisture and temperature. 
The low land, it’s more or less desert, it’s very hot and dry. You get higher up in the 
mountains and it’s cooler and wetter. [Sen] 

Here, Sen does not begin with a reading of the graph, but prefaces his description with an overall 
statement about the purpose of the graph. This is the kind of reading we often get when 
individuals “read” graphs or diagrams on topics with which they are thoroughly familiar. 
However, our research also shows that the same scientists, when not thoroughly familiar with a 
topic, have to expend (sometimes tremendous) effort to construct the meaning of a graph or, as 
we show below, abandon all interpretation before integrating the graph into their familiar 
discourses. 

Sen, who has been teaching an introductory ecology course for several years, provided us 
with a “literal” reading of a particular aspect of the graph, the position of the distribution 
maxima. 

This [graphic] is just showing you the distribution of numbers of plants that have 
different types of metabolisms. Where it is coolest and least dry [C3 max ], relatively more 
C3 plants. Where it is sort of intermediate here [abscissa C4 max ], and intermediate 
temperature, intermediate dryness can have relatively more [C4 max ] C4 plants. And 
where it is extremely hot and dry [abscissa CAM max ], because this is South Texas after 
all, we have relatively more [CAM max ] CAM plants. So these metabolic differences 
happen to have strong effects on distribution of the plants. [Sen] 

This type of graph is frequently used in introductory ecology courses. For example, resource 
utilization along some niche parameter and specialization (adaptation) which expresses itself as 
population density variations along the adaptation parameter are commonly found types of 
distribution graphs (e.g., Ricklefs, 1990, pp. 732, 752). In fact, Ricklefs (1990) shows several 
distributions of flora species along moisture gradients: deciduous trees along a moisture gradient 
in Wisconsin using “average importance values” as the ordinate dimension (p. 687); Oregon and 
Arizona with a biomass measure, stems per hectare, as indicator for “importance” (p. 666); and 
hypothetical graphs distinguishing open and closed communities along an environmental 
gradient which is, in the text, exemplified in terms of a moisture gradient (p. 659). In the original 
article from which the Plant Distribution task was drawn (Eickmeier, 1978), photosynthetic 
pathways as the major means by which ecological resource (niche) division occurs in plants 
along a moisture gradient was the key point in its interpretation. 

Thus, without much work of reading the details of the text (graph, caption), Sen provided us 
with a reading of the significance of this graph in the domain of his research and teaching. Rather 
than some cognitive aspect that distinguishes him from the other scientists, his greater familiarity 
with this kind of graph, and this graph in particular, is a more reasonable and simpler 
explanation. Our conjecture, about the importance of habitual engagement in graph interpretation 
in settings where it is common practice to engage in such domain-specific activities, gains 
increasing importance as we provide our analyses of the other practicing scientists, and in 
particular those who found the Distribution Graph meaningless and difficult to interpret. 

Reading the Distribution Graph 

For scientists who were less (or not) familiar with the topic (photosynthesis), domain 
(botany), or research methodology (transects), interpreting the graph was a more protracted 
effort. 

Here [abscissa] is your elevation. So you’re taking some kind of a transect that goes up 
the mountains. And down in the valley you have warm dry climate and as you go higher 
you’re getting cooler temperatures and a little of precipitation and cloud formation and as 
a result something is zoning itself out. [Sid] 

In the first structuring move the abscissa is made salient. At this point, the graph is characterized 
by an abscissa with the particular feature of having elevation as a referent. In his next move, Sid 
found a referent in the world (of his experience), the abscissa standing for a worldly situation 
where making transects by sampling along some geographic parameter is done. Sid drew directly 
on his experience, which includes collecting samples in oceans, of some phenomenon distributed 
both horizontally (geographical distribution) and vertically (depth distribution). Furthermore, as 
he described the transect moving up the mountain or down into the valley, he also associated 
these with the experience of changing “climates” associated with such moves. 

In the same way, Sam invoked the changing climes and fauna that can be directly 
experienced on the West Coast, or on any trip into the Rocky Mountains, Alps, or other 
mountain ranges. Ira (an “Instructor”) talked about a trip across Mount Kenya from semi-arid 
plains on one side through plantations of coffee into the cool mist, and Sid articulated a zonation 
of those things to which the distributions refer. In Vancouver (Canada), for example, both 
zonation and climate differences are visible during many parts of the year when there are barren, 
snow covered peaks on the mountains and blooming, even exotic plants in the low lands. Here, 
the salience of elevation was used to construct vivid images and descriptions of natural settings. 
Scientists make sense, that is, link the representation with their other understandings and 
experiences (“so it, just as you go up it gets colder and wetter, that makes sense” [Sally]). In this 
case, “sense” to Sally meant that there is a preservation of the structural properties of the graph 
in which the graph can be read as indicating higher = wetter and cooler — which is consistent 
with her experience. 

So, it’s showing some kind of zonation that whatever C3 is it likes, it dominates at these 
higher altitudes, C3, it’s kind of, it’s a minimum and it actually picks up down at the 
bottom. So, there’s a bi-modal distribution in this. That’s CAM that may have a zone in 
the middle, on the upslope where it reaches a maximum and doesn’t grow anywhere else, 
or doesn’t live anywhere else. And C4 is a weakly bi-modal, it has a peak there [left] and 
a major peak [C4 max ] there so you’d find C4 dominating, well not dominating because 
C3 is dominating but it’s relatively high at mid elevations. [Sid] 

Here, Sid constructs the graph as an object which can refer to something else. He is concerned 
only with the particular features, the location of the various peaks and valleys with respect to the 
elevation, and with respect to the relative frequency. As to the latter, he distinguishes between 
“dominating” and being “relatively high,” in which the C4 peak is contextualized and therefore 
relativized in two different ways. In the first, “it is not dominating” sets the C4 peak in relation to 
the C3 graph at the same abscissa location; in the second, the C4 peak is read relative to the other 
points on the C4 graph. The ecology professor (Sen) who was familiar with the research that had 
led to the graph never entered this stage of the interpretation. 

OK, so they “predominate in the hottest, driest environment” but why they drop off at the 
hot, why they didn’t go up there, that’s a question I have about it. [Sid] 

Here, the opposite of making “sense” occurs. At this point, the graph appears to be inconsistent 
with the caption text which indicates “predominates in the hottest driest environment” whereas 
the graph shows a drop in the relative importance. Sid appeared to say, “I cannot make sense of 
this feature of the graph,” that is, he could not integrate it into what he already knew or what the 
other parts of the text (graph, caption) told him. Testing consistency did not only move from 
graph to experience or inference, but also the other way around. Sally first suggested, “I assume 
these things [C3, C4, CAM] don’t all live at the same level?” but then rejected that assumption 
as she inspected the graph which showed values unequal to zero for each of the graphs (“these 
guys are actually all existing at each of these elevations. It must be, obviously”). 

I guess CAM are succulents. . . they are obviously very good at holding moisture. I mean, 
plants that live in hot drier areas tend to be very good at it, they’ve got waxy coatings on 
their leaves and they tend to be very good at not losing moisture when they are 
exchanging gas. Oh, so these guys actually have a nocturnal gas exchange for water 
preservation, oh cool, okay. [Sally] 

Although Sally did not know about the photosynthetic mechanism and how it operated in the 
course of the day, she drew on what she knew (waxy coatings) and what she read in the 
caption (nocturnal gas exchange) to construct a story that made sense to her. 

Sid concluded with a statement about the adaptations of the plants which allow them to 
compete relative to other plants, or to succeed in particular climes, “So each one of these plants 
has adapted some strategy to succeed over other plants or succeed in a particular temperature and 
moisture domain” (Sid). Similarly, Sally concluded by discussing the plants’ adaptiveness to the 
climate in which they are found. 

I don’t know what kind of plants these guys would be, I would presume they possess 
some, I guess, it’s not clear to what, what sort of adaptation these guys [C4] would have, 
but these guys [C3 max ] are probably adapted, what’s 2000 meters, that’s fairly high, so 
they’re probably adapted so much to that higher elevation, certainly accustomed to a lot 
more moisture. [Sally] 

The scientists ended with explicit statements about the adaptation. These statements arose from 
the scientists’ attempts to explain the contrast between the three curves associated with the three 
photosynthetic mechanisms. One scientist (physicist) made direct links to C3 plants as possibly 
being grasses or conifers, or other Alpine flowers, the kind of plants he knew from experience 
grow at high elevations. 

The “discrepancy” that C3 and C4 plants increased in relative importance at very low 
elevation levels was not necessarily a salient element in scientists’ interpretations. Some scientists 
and the instructor noted it but did not address it at all (e.g., Sid). Sam, however, suggested 
that, possibly, the gradients of moisture and temperature indicated at the top of the graph may not 
hold at the lowest elevations, or that a lake or ground water levels provided the moisture to which 
C3 and C4 plants were adapted, thereby displacing the CAM plants. 

If you see relative abundance, then you add it up probably to something like a 100. These 
are not independent, the three curves. Like you don’t have, you don’t have something like 
density that’s plotted. Then, but if you get a peak here, you necessarily get depths in the 
other. [Sam] 

Sally, too, noted that the three graphs were not independent and suggested that the graph was not 
a good plot and that “it would be a lot better to plot a straight out density or biomass or 
something like that, or just whatever, straight numbers, whatever you wanted to represent.” 
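
The dependence that Sam and Sally point to can be made concrete with a small numerical sketch. The densities below are invented for illustration and are not taken from Eickmeier (1978); the function name relative_importance is likewise our own, and we assume, as Sam does, that "relative importance" is a percentage share that adds up to 100 at each elevation. Under that assumption, two plant types whose absolute densities never change still show a dip in their curves wherever the third type peaks.

```python
# Hypothetical absolute densities (plants per plot) at five elevations.
# These numbers are invented for illustration; they are NOT the published data.
elevations_m = [500, 900, 1300, 1700, 2000]
density = {
    "C3": [20, 20, 20, 20, 20],   # held constant on purpose
    "C4": [10, 10, 10, 10, 10],   # held constant on purpose
    "CAM": [10, 70, 30, 10, 5],   # only this type actually varies
}


def relative_importance(density):
    """Convert absolute densities to percentages that sum to 100 at each site."""
    n_sites = len(next(iter(density.values())))
    totals = [sum(values[i] for values in density.values()) for i in range(n_sites)]
    return {
        name: [100 * values[i] / totals[i] for i in range(n_sites)]
        for name, values in density.items()
    }


relative = relative_importance(density)
for i, elevation in enumerate(elevations_m):
    shares = ", ".join(f"{name} {relative[name][i]:5.1f}%" for name in relative)
    print(f"{elevation:>5} m: {shares}")
```

In this sketch the C3 share falls from 50% at 500 m to 20% at 900 m although the C3 density is the same at both sites; the drop is produced entirely by the CAM peak. This is the reason Sally would prefer a plot of "straight numbers" such as density or biomass.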

Analysis 

The scientists who read the graphical representation in this way did three dimensions of 
reading work. First, they read the lines in terms of their past experiences relating to a changing 
fauna with elevation and associated climatic changes. At the same time, they located the three 
distributions with respect to each other. Finally, they attempted to explain the location of the 
three maxima with respect to each other by drawing on the concept of adaptation of plants to the 
physical environments. Their analytic work carved the reality of the graph such that each of the 
three relations told a story about the relative frequency of a type of plant (even if they did not 
know what type of plant it might be). The other dimension of their work was the relation of the 
text to some state in the world. First, this state is about the relationship between the frequency of 
one type of plant with changing elevation (or climate). In the second instance, the state has to do 
with the existence of ecological niches. 

Instructors & Preservice Teachers 

We're just trying to determine what was the purpose 
of this graph beside showing distribution 

The interpretation sessions of three instructors and four pairs of preservice elementary 
science teachers were characterized by their predominant focus on the nature of the graph (and 
almost nonexistent discussion of referents in the world). Their readings were largely literal 
rather than being concerned with the implications of the contrast raised by the three graphs. One 
preservice elementary teacher’s comment at the end of their interpretation, “Because all this does 
is tell us where these 3 points are” (Etta) in a way summarizes what these individuals and groups 
concluded about the distribution graph. The fifth group of preservice elementary science teachers 
differed somewhat because, in a manner similar to that of scientists, they attempted to link the 
distribution graphs to their experiences in the desert, on mountains, and in different parts of the 
US. A secondary teacher summarized his analysis in a similar way: 

All I can say here is relative importance whatever that’s supposed to mean here for the 
C4 type plant, it never gets all that high at whatever elevation, the highest it gets to 
whatever 30 something that’s supposed to be, at around 1, 2, 3, 4, 5, about 1400 meters, 
the CAM one varies greatly, it’s much lower at the higher, jumps up to its maximum at 
around 800. So, I can give you some numbers. I was not entirely sure what they mean by 
relative importance. (Ian) 

As a result, the participants did not feel particularly successful at the completion of the task. Eva 
suggested, “If I came across this in a textbook, I would likely just skirt right by,” and Eldon 
commented, “Relative importance, I didn’t get that part.” 

The following episode was recorded five minutes into Erica and Eliza’s 27-minute session 
with this graph. At this point, they attempt to establish the relevance of some basic graphical 
features such as the additional abscissa above the graphs: 

Erica: But look, it was relative importance, 40, 80 does it say anything about that? 

Eliza: This is just a XY graph, not XYZ or anything, it seems strange that there’s three [top 
abscissa], I don’t know about X 

Erica: Like, you know what I mean, there is this [abscissa] and this [ordinate], you know, and 
there is also that [top abscissa] 

Eliza: Yeah, but this [top abscissa] is just READS [caption] desert and semi-desert, but it seems. 

Erica: So we can even ignore that [top abscissa]? I mean, and just go according to this [ordinate] 
[abscissa]? But the thing is. (10 s) 

Eliza: Relative importance. Well it’s a distribution along a moisture and temperature gradient 
due to differences in elevation. So this [upper abscissa] corresponds with elevation, it’s 
hottest, right? It’s hottest and driest at 500 [500] and as you get to an elevation 

Erica: OK, so that’s what it is all about? 

Eliza: At 2000 meters it becomes coolest and least dry because they say C3 predominate 
[C3 max ] at the cooler, least dry end [upper abscissa, C3 max ], that’s what they’re telling 
us. 

Eliza and Erica attempted to integrate (“make sense of”) the graphical representation, 
particularly the secondary abscissa indicating climatic gradients associated with the elevation 
gradient, with the discourses about x-y and x-y-z graphs with which they were somewhat more 
familiar. For example, in their first reading, they treated the correlative abscissa as a different, 
third dimension as it would appear in an x-y-z graph. Erica asked whether they could ignore the 
secondary abscissa and interpret the graph as a relation between elevation and relative 
importance. However, having re-read the caption, Eliza pointed out that the upper abscissa is 
simply correlative to the lower one and made explicit links between the elevation scale and the 
temperature-moisture scales (e.g., 500 m hottest/driest, 2000 meters coolest/least dry). What is 
remarkable about the episode is that neither Erica nor Eliza attempted to link their statements 
about the relations in the text (graph, caption) to their personal experience or other referents in 
the world that might have helped them to make sense (establish structural equivalence) and 
therefore increase their understanding both of the graph and the world. 

Erin: READS [“C4 plants are maximally important under intermediate temperature and 
moisture conditions.”] So, intermediate and moisture, like what I would say is that the 
hottest and driest is going to be at that elevation? 

Etta: Well, it is because, oh well just because it says that, it doesn’t mean that this is the 
hottest and driest that’s possible on planet earth (Erin: No) It just happens to be that this 
is hottest and driest compared to this over here, so it appears on this section right here. 

Erin and Etta struggle for meaning at every step of their analysis, that is, try finding in their own 
experience discourses that would help them elaborate the text (graph, co-text) in front of them. In 
the following episode, Eldon and Eva attempt to relate the elevation (lower abscissa) and the 
climate variables (upper abscissa): 

Eldon: Least dry. So that must, yeah, so least dry over here [C3 max ], so it’s cool there 

Eva: I think that this is what is confusing. Like I get that [C3 max ] this highest point 
[CAM max ] just when [C4 max ] it’s best at photosynthesis and then this [left, upper 
abscissa] is hottest, driest, coolest [right, upper abscissa], least dry. Then this elevation 
[500 m], like in my thinking lower elevation would mean, I guess not, I was thinking 
cooler, colder, and then higher [2000 m] elevation would be warmer. 

Here, even relating the two gradients in the context of the caption appeared troublesome. Eva, 
who had read the caption which indicated that C3 plants do best in a cool wet climate, had 
trouble with her association of coolest and least dry with lower elevation. Each aspect that they 
identified could therefore not be taken for granted but had to be integrated with the other pieces of 
the graph (cum caption). In the same way, Ema struggled with connecting the ordinate construct 
(relative importance) and its scale to something she was familiar with: 

But here, I mean, how do you connect this thing at 40 and 80, do you see this as a 
percent, or do you see, what do you see? These 40 mean, can it mean something? You 
know what I mean, like without just looking at these ‘cause these correlate right, these 
500 mean hottest and like that but here. [Ema] 

In the process, she attributes “hottest” to an elevation measure (500), rather than constructing the 
relationship as an association. In part, these student groups struggled with what appeared to them 
to be arbitrary associations which, in the readings of scientists, were immediately meaningful 
through the association with their personal experience in relevant environmental settings. Thus, 
although the preservice elementary science teachers found themselves in the same situation as 
many of the scientists (i.e., not knowing about this aspect of ecology) the preservice elementary 
science teachers appeared to struggle with each element, the meaning of each process (CAM, C3, 
C4), whether these labels stood for individual plants, types of plants, or processes, what the 
referent of “relative importance” might be, and so forth. 

Another source of difficulty in interpreting the graph was the reading of words in 
non-canonical ways. “What I was thinking is that, the importance of 
the C3 being at coolest, least dry at 2000, it’s very important that it would occur there, you know 
what I mean?” (Ema). “I’m not understanding what relative importance means (pause)? I guess, 
important to the environment, or what?” (Erin) “This photosynthesis process which occurs in this 
C3 plant is not so important, there might be other processes which occur,” and “the importance 
of it occurring at this elevation at this dry.” Uncertainty about what the ordinate 
axis label “relative importance” meant contributed to the difficulties encountered in interpreting the 
graph. 

Although the conversations among the preservice elementary science teachers and science 
instructors generally did not elaborate on worldly referents, we already described in the previous 
section that Ira began his session by referring to his experience of hiking up and down Mount 
Kenya, and the changing climates and flora associated with this trip. There was also one group 
which, to a much lesser extent, attempted to link the graphical representation to places they had 
already visited. The following quote also provides an example of participants reaching conclusions 
which, although not necessarily incorrect, were not relevant to the biology of the plants in the 
problem at hand. Here, Eli linked “thinner air / atmosphere” with lower moisture levels, an 
inference that contradicted what he had read from the graph and caption. 

Eli: I think of elevation as well, you go up higher there is less, there’s thinner air, thinner 
atmosphere means generally less moisture because there’s less air, so there would be 
less water in the air. It’s harder to breathe the higher you go up, because there’s thinner 
air, it would mean less (Ella: Less moisture?) Yeah it’s true, there is snow up there, but 
it is pretty frozen, I don’t know if that counts as moisture, I don’t know, it might. I 
guess it precipitates, it’s snow, but it’s snow moisture. 

Ella: Well, the further north you go in Canada, the climate is generally quite dry so I guess, 
in those terms, but the elevation is not that high. 

Eli: But this is Texas too, which is just generally warm. I’ve not gone, I’ve been to 
California, I haven’t been to Texas, I’ve been to the central area of the States, yeah, or 
the eastern part of it. 

Before that, Ella had already talked about a visit to the Californian desert and her experience of 
the large temperature variations and the relative dryness; she furthermore noted that cacti had to 
be well adapted to such a climate. But neither she nor her partner generalized the adaptation 
argument to the other types of plants described in the caption and to the graph as a whole. 

Although the discourse in these groups generally stayed within the context provided by the 
graph/caption, there were three brief instances in which comments addressed issues of adaptation 
and survival. Yet in each instance the significance of these issues was neither pursued nor 
made salient; they generally appeared as passing comments and remained unelaborated and little 
connected to the interpretive task. 

But the thing is if you look at it, none of them are at the same elevation, so they all 
predominate too at different elevations which allows better survival as well. Because 
they’re not fighting for area and they’re not fighting for the resources in that area. [Erin] 

It just looks like you have three different plants, each photosynthesis method makes it 
more suitable for different environments, so as you could, go through the gradient, you 
get different plant populations predominating. [Ed] 

These comments were the only ones in the entire sequence related to the Distribution Graph in 
1,596- and 749-word sessions, respectively. These were not culminating comments that 
summarized the activities, but were strewn in the middle of their talk about the nature of the 
graphs. 

Other Scientists’ Interpretations 

I knew that it wasn’t very meaningful, it was just trying 
to show visual patterns that were detached from reality 

Two of the scientists refused to engage in a school-like activity and critiqued the graph from 
the perspective of their own work. In both cases though, they provided nearly transparent 
readings of their own graphs which had appeared in research journals and reports (Roth, 
Masciotra, & Bowen, 1998). Both provided us information at the end of the interview or sent us 
information afterward on how to prepare better graphs than the ones we had used in the research 
with them. 

I knew that it wasn’t very meaningful, it was just trying to show visual patterns that were 
detached from reality. But when I see this sort of thing here, it’s important to me to 
understand what the scales are so I can read. Because this is in theory something that is 
very real, so they draw a lot of these [abscissa] scales. I look at this and I can sort of 
forgive it because there’s absolutely no information at all about, you need sort of far more 
explanation. [Sandy] 

It is evident that their difficulties have to be taken seriously. Both were successful in their 
professional domains and had a considerable number of publications. In their explanations of 
graphs which they had constructed for publication purposes, the graphs were actually 
transparent, and individual features offered them occasions to develop thick descriptions rich 
in detail about the contexts in which the data were collected. Yet with the unfamiliar graphs they 
struggled considerably and abandoned efforts at making sense. 1 The following excerpt, in which 
Soren (M.Sc., forestry) wrestles with the “meaningless” acronyms C3, C4, and CAM, illustrates 
this struggle. 

I mean, because you’ve got, you’re talking about relative importance, you’ve got 3 
different species here or whatever. Are you talking about the relative importance of this 
one [C4] to these other ones [C3, CAM]? Or to some other external influences? I don’t 
know. [Soren] 

1 During our interviews, there were three other graphs in addition to the Plant Distribution graph (Roth, Masciotra, 
& Bowen, 1998): birth and death rates as function of population size; graphical representations of essential, 
substitutable, and complementary resources in the form of isographs; isobologram representing the effect of two 
resources for plant growth. 

Sandy (Ph.D., marine biology) similarly attempted to come to grips with the notion of “relative 
importance.” Both scientists struggled with the fact that they neither understood nor could relate 
to the ordinate label, “relative importance,” and that they did not know what the acronyms C3 and 
C4 stood for. Although Sandy realized that the graphs were to tell a “bigger message,” he, like 
Soren, did not provide readings in the way the other scientists did. 

Discussion of Plant Distribution Graph 

On this task, the split between scientists and (preservice) teachers was not as clear as on the 
Lost Field Notebook. Two scientists found the graphs meaningless and did not provide an 
interpretation. Most of the preservice and inservice teachers engaged in the construction of the 
graph as a sign (Figure 7, upper left) and in reading the individual curves literally, that is, as a 
change in the distribution of a particular plant type. They predominantly read the graphs in their 
relation to the abscissa (i.e., noting that there are changes of the relative importance with 
elevation). This is what one would have expected from the Lost Field Notebook. One of the 
instructors and eight scientists contrasted the different curves and therefore constructed a 
phenomenon that was not literally available in the graphical representation: the differential 
adaptation of plants with different photosynthetic mechanisms to climate. In the process of 
reading, the scientists drew on past experiences related to the research method that yields data such as 
those presented and on their familiarity with questions of adaptation to construct their reading of 
this graph. The phenomenon emerged from the mutually constitutive reading of graph and 
constructing the elements of the graph on the basis of familiar experiences. Finally, in the 
interpretation of the ecology professor who used this and similar graphs in his lecture, the graph 
as representation became transparent. He talked about the phenomenon of adaptation and how it 
led to different distribution of plant types in varying climates. 

Thus, we can see differences where our (preservice) teachers focused on reading the lines 
and indicating what relationship they express. The scientists contrasted the lines and constructed 
a secondary text in which the relative position of the lines had to be explained. One 
student group and one instructor actively used past experience (trips to the California desert 
and the Southeastern and Central USA; across Mount Kenya) to enhance their grasp of the Distribution 
graph, that is, to increase the linkage between this previously unknown text and other texts that 
they were familiar with from scientific and experiential domains of their Self, but this was 
uncommon. 

The difference was further observable in the scientists’ greater tendency to draw on personal 
experience (strip sampling, constructing transects, traveling up mountains) as an important 
aspect of making meaning, that is, constructing links between extant experience and 
understandings of the new graph. In this, they could test whether some relation they inferred 
from the graph “makes sense,” that is, is consistent with another aspect of their 
experience/understanding. 

FINDINGS III: DATA COLLECTION, TRANSFORMATION AND INTERPRETATION 

The previous two sections dealt with practices of interpretation which required a 
transformation of a data set (the LFN problem) or of data that had already been transformed (the 
Plant Distribution graph). In both cases, data and graph were ready-made and the tasks could 
therefore be critiqued as more school-like than authentic (Lave, 1992; Roth, 1996). In the 
conduct of science research there are several practices which precede these representations, their 
transformation, and their interpretation: question “asking,” operationalization of variables, and 
collection of data. In the past we have attributed the difficulty in interpreting graphs to the 
interpreter’s lack of experience in conducting these initial stages of research. 

The conduct of effective research has several critical features which must be attended to. If 
students in public schools are expected to do this type of analysis, it is not unreasonable to 
expect that their teachers can engage in these same practices themselves so that they can best 
scaffold the children into and through these activities. Scientific research proceeds 
from the asking of “do-able” questions (Fujimura, 1987). These questions must have variables 
“identified” which must then be operationalized so that data can be collected. Various 
representations/transformations of the data then occur, and from these, claims are made which 
generally refer to the original questions asked. 2 The following analysis structures and orders the 
questions, design, representations, transformations, and claims made by secondary preservice 
science teachers (with science degrees) in their research project reports, highlighting the 
approaches used in each area and pointing out where there are inconsistencies in the chain of 
argument between one section and the previous section(s). 

2 Various authors have pointed out that the written outcomes of a scientific study may result in questions being 
presented as if they were the ones originally asked, although they developed post hoc as the study progressed, or 
even in the formal interpretative stage when the actual research was concluded. Regardless, a scientific study is 
generally written about in a manner which gives the appearance of internal consistency and coherency from the 
original framing of the “question” to the “claims” about that question. 

Analysis of Preservice Secondary Teachers’ Reports 

To aid our analysis and interpretation of the work done in the reports, we used the following 
analytic frame to examine how closely the reports of the preservice teachers 
paralleled “typical” scientific reports. Generally, this frame evaluated competence in 
conducting and reporting research as this relates to the stages evident in the epistemological vee 
that the preservice secondary teachers used. In our analysis, we evaluated the projects submitted 
using the following set of questions: (a) What is the nature of the questions (correlational, 
relational, causal)?; (b) Are the constructs and variables operationalized effectively?; (c) How are 
data represented (e.g., tables)?; (d) What data transformation techniques have been used (e.g., 
graphical inscriptions)?; (e) What interpretations of the data are made?; (f) Are consecutive steps 
in the inquiry (a through e) consistent with each other?; and (g) Do the interpretations address the 
focus questions? 

A cursory examination of the preservice secondary teachers’ reports suggests that they 
contain the fundamental components of scientific research reports: questions, data tables, graphs, 
interpretations and claims/implications are generally all present — as one might expect them to be 
given that the epistemological vee provides prompts for these elements to be included. 

To examine these reports in greater detail we independently viewed the reports and coded 
them into the representation seen in Table 6. Each student report was summarized in the table by 
highlighting (i) what type of question was being asked (Column 1), (ii) what the variables were 
(Column 1), (iii) how the variables were operationalized (Column 2), (iv) how the data were 
represented (i.e., maps and tables; Column 3), (v) what transformations were used (Column 4), 
and (vi) what claims were made (Column 5). As well, symbols were used on the table to indicate 
when variables were not measured in such a way that they could be compared, when 
transformations did not relate to the original questions, when inappropriate graphs were used, and 
when claims did not relate either to the data or to the original question. Our closer analysis 
revealed that in the details of the research work there were many instances of non-standard 
approaches to the research and inconsistencies in the analysis of the data (Table 6 details ninety 
such problems). Generally, there were: research questions unanswerable by the study design, 
constructs inappropriately operationalized, data reported and transformations (graphs) used 
inappropriately, and claims which frequently did not match research questions or the data 
reported. Table 6 was ordered so that reports with the scientifically most acceptable practices and 
interpretations of data were at the top and those with the fewest at the bottom. We first 
summarize the findings and then use a representative student report from the top, middle, and 
bottom third of the table which we elaborate in detail to examine the use of various scientific 
practices in the field projects and the internal consistency of the reports. 



[Insert Table 6 about here] 

Structuring Research Questions - Design Issues 

When the pre-service secondary teachers first entered the research area (located in 
“undeveloped” mixed forest at the edge of the university property) there was considerable 
discussion in the student pairs about what “do-able” questions they were to investigate. As 
students continued to work on identifying the area in which they were going to conduct their 
research work, staking out boundaries, and drawing a map of the zone, they started formulating 
specific questions to address as they noticed more and more specific details of the zone and 
reflected about the equipment which had been made available to them. 3 

Many of the investigations were framed as “causal” investigations — of the twenty-four 
questions addressed by the students fourteen were causal. For example, some of the causal 
questions asked were, “How does the moisture level affect the distribution and height of 
horsetails in our investigative site?” (Table 6, Question 3.b.) and “Do the exhaust gases from the 
cars parking in Lot C directly effect concentration of field flowers in front of the lot?” (Table 6, 
Question 11.a.). In the first case, two components of the question indicate that it is intended to 
address causal relationships. Firstly, asking “how” indicates causality and, secondly, assuming 
the directionality that it is the moisture which affects the horsetails, not the horsetails affecting 
the moisture level (as is the case in some plant species), indicates that the intent of the question is 
causal not correlational. The second example is also clearly causal in intent because of the 
directionality implicit in the question as field flowers could not affect the release of “exhaust 
gases from the cars” but the exhaust gases might affect the field flowers. Note that for both 
questions it is not that directionality/causality cannot be demonstrated, but that the temporal 
structure of the activity (two field periods within 8 days) makes a causal investigation unfeasible. 

3 In this, their discourse was similar to that of the grade eight students we observed working on similar field-based 
science activities (Roth, 1996; Roth & Bowen, 1993, 1995). For the field-based activity in the present study, the 
preservice teachers had the same equipment available as that used by the grade 8 students in the earlier study. 

Operationalization of Variables 

Being able to appropriately/defensibly operationalize variables in a study is a key step to 
being able to construct claims from the data collected. If variables are poorly operationalized, 
then it is usually impossible to make claims that relate back to the original focus question(s). 
Seven of the twelve student reports had problems with how they operationalized 
their variables and/or with replication or sampling (Table 6; second column). In part, this was the 
result of addressing questions which were difficult to operationalize, involving as they did 
biological factors such as “competition,” “biodiversity,” or “productivity” that are ecologically 
quite complex (and abstract) and which often require long-term studies and data collection. 
However, problems with operationalization of variables also occurred in situations where these 
conditions were not present in such a way that they would interfere with effective 
operationalization. 

An example of effective operationalization of variables is found in the first study (Table 6; 
Questions 1.a. & 1.b.). In this study, to address the questions, “Do spittle bugs show host 
preferences for three dominant plants in the plot?” and “Is there a relationship between light 
intensity and the distribution of [plant species] in the plot?” the preservice secondary teachers: (i) 
identified locations of individual plants of the three species; (ii) counted the number of 
spittlebugs on “ten stalks of each plant at five randomly chosen sample sites”; (iii) graphed the 
average number of spittlebugs found (with error bars) for each type of plant; (iv) measured light 
intensity in a grid across the entire mapped area; (v) drew a pattern map with the light intensity 
indicated over which was laid the locations of the [plant species]; and (vi) made claims related to 
the original focus questions using the data set collected and depicted in (i) to (v). This sequence 
effectively operationalized the originally stated focus questions. 

In conducting our analyses, we decided to highlight instances of problems with 
operationalization which were apart from those of causality being inappropriately addressed 
(indicated in the previous section). Two major types of problems with operationalization were 
highlighted: (i) the measured variable ineffectively reflected the conceptual intent of the initial 
question, and (ii) there was insufficient replication or an inappropriate sampling regime. 

One particular study will be used to illustrate both of these situations. This study (Table 6, 
Questions 12.a. & 12.b.) addressed the questions, “How does the side of a fallen log affect it’s 
biodiversity?” and “How do the burned portions and the recent and older exposure of new wood 
affect the snags biodiversity?” To address both questions “biodiversity” was operationalized as 
the “frequency/quantity” of different types of organisms: lichen, moss; small plant growth 
(non-lichen, moss); spiders; beetles, larvae; and insects. Counts for lichen and moss in different areas 
were indicated as whole numbers ranging from one to five; no indication was made as to whether 
this enumeration indicated patches of the plants or individual plants (which, given the setting, is 
highly unlikely) or how this related to patch size. In this study, a count of “2” insects in a section 
represents large (macroscopic) insects visible at the surface, not those beneath the surface of the 
soil, under plants, in logs, etc. In this case, insufficient operationalization and sampling meant 
that even correlational claims based on the data as presented would be inappropriate. 

Representing Data 

Data were represented or depicted in the reports in two main ways: in “maps” (which were 
requested as part of the assignment) and in tables. All reports included a map representation and 
14 of the 24 focus questions had data summarized in a table. Several of these tables were 
structured in non-standard ways and did not aid in understanding any patterns that may have 
been interpretable in the data. For the reports that did not use a table, using a table would have 
aided interpretation. Indeed, for the questions that were being addressed, using a data table in the 
collection process might well have led to more effective data collection. Our ethnographic field 
work with ecologists highlighted the role that tables served in their work — less as a 
representation tool than as one which “reminded” them what data they needed to collect (Roth & 
Bowen, 1998). In the practice of field science, tables serve as a tool which organizes researchers’ 
thinking towards the focus questions and what data need to be collected. Observations made of 
students in their lab activities during this course suggested that they viewed tables as a 
representation/presentation tool and not as an integral part of organizing the research before and 
while the data were being collected. 

The maps in reports occasionally served as a surrogate for tables by helping the report writers 
relate variables. In three of the reports the maps were not sketches of the landscape upon which 
data sampling sites were recorded but were instead grids onto which measurements, locations of 
plants, or counts were recorded. Three other maps were diagrammatic sketches detailing plant 
locations and physical locations onto which measured data (e.g., light levels, moisture levels) 
were inscribed. The remaining six reports contained maps which detailed plant and substrata 
distributions but which were not used to indicate any measured features from the focus 
questions. Thus, it was not possible to use them to examine relationships between the biotic and 
abiotic variables under study. In essence, these maps served as iconic representations, 
“pictures” of the site, but contributed little to the investigation of relations being conducted. 

Transforming Data: Using Graphical Inscriptions 

Use of graphical representations occurred in almost all of the reports (10 of the 12; one of 
those that did not may have been able to more effectively interpret their data if they had). 
However, there were many problems with how graphs were used in the reports to depict the 
collected data. One report used line graphs when bar graphs would have been more appropriate 
for the data, three reports used bar graphs when scatter plots were more appropriate, and one 
report used a one-dimensional bar graph when a 2x3 bar graph would have better illustrated the 
data set. In one case, further insight would have been gained if an X-Y-Z plot had been used. 

Even when scatter plots were used (five times), best-fit lines were drawn on only two of them, 
and in one case the line was placed incorrectly. An outlier, which might significantly affect 
interpretation, was noted in only one case, although in the previous example (of the trend line 
being misplaced) identifying an outlier would have affected interpretation of the graph. 
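
The two checks discussed above, placing a best-fit line and noticing an outlier, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `fit_line` and `flag_outliers`, the two-standard-deviation threshold, and the moisture and height values are assumptions made for the example, and none of them come from the student reports.

```python
import math

def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def flag_outliers(xs, ys, threshold=2.0):
    """Return indices of points lying far from the fitted line.

    A point is flagged when its residual exceeds `threshold` times the
    residual standard deviation (an assumed, rule-of-thumb criterion).
    """
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    # Residual standard deviation with n - 2 degrees of freedom.
    spread = math.sqrt(sum(r * r for r in residuals) / (len(xs) - 2))
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * spread]

# Hypothetical measurements: one value per sampled quadrant.
moisture = [1, 2, 3, 4, 5, 6, 7, 8]
height = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 30.0, 16.1]

slope, intercept = fit_line(moisture, height)
print(f"best-fit line: height = {slope:.2f} * moisture + {intercept:.2f}")
print("indices of possible outliers:", flag_outliers(moisture, height))
```

A flagged point is a prompt to return to the field notes, not a licence to delete the measurement; refitting the line with and without it shows how strongly a single point can shift the trend.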

Apart from the broader concerns of appropriateness of representation, graphs were often 
labeled or structured in ways that confounded their interpretation by the readers and were often 
inadequately (or not) discussed in the text of the report. For instance, one graphical 
representation depicted “gradient” and “change in moisture” on its axes, but was not referred to 
in the text of the report nor discussed in the “methods” section and was therefore lacking 
interpretive context. 

In several student projects labeling of the maps, tables, and graphs was such that they 
contributed little to understanding what these inscriptions were representing. As a result, readers 
had to spend considerable time trying to relate written “claims” to the various inscriptions in an 
attempt to understand how the claims were derived. This stands in contrast to typical writings of 
science in which there are clear cues and pointers between report text, captions, and labels which 
together help readers constitute and construct for themselves the claims from the data 
being presented. Understanding derives from reflexively cycling back and forth between the text 
and the inscriptions, relating those pointers to one’s own experiences. As labels, titles, and text 
become impoverished so too, subsequently, do the understandings which readers derive from 
them. 

Four graphs were drawn which were unrelated to the questions being addressed, and in some 
instances there appeared no conceptual reason to construct some of the graphs (such as plotting a 
bar graph of averages of measures across a slope). In total, 6 of the 10 reports using graphs 4 had 
problems with how they used graphical representations to depict the collected data with a 
subsequent effect on the claims drawn from those representations (Table 6). 

4 Whether it was appropriate (in our view) to utilize a graphical depiction for a particular data set when one was not 
used was not a consideration in this critique. The total was derived from conceptual issues arising from graph 
usage; critiques of structural difficulties such as poorly labelled axes, poor titling, or 
non-discussion of the graph in the text were not part of this total. 

Interpreting Research Data 

The concluding section of a scientific report draws conclusions about patterns in the data 
collected/represented and discusses the data in the context of the original question(s). This is often 
followed by a discussion of implications of the data, any issues arising from the design of the 
study being reported on, and future questions which might be addressed. The epistemological 
vee prompted the preservice secondary teachers to include both graphs and interpretations of 
those inscriptions in the analysis of their data. In our constructive critique of the claims section 
of the reports we therefore focus on the interpretations of the data being reported on (both graphs 
and tables) and how these interpretations relate to the original question(s). Our analysis examines 
the graphical representations and analyses used to report on students’ own research work in 
contrast with analyses of the Lost Field Notebook and Plant Distribution graphs. 

Several of the reports had interpretations which clearly followed from the collected data and 
its representations and transformations. However, many of the other reports made claims 
unrelated to the original question(s) or which did not logically extend from the data 
collected/depicted. Of these, the latter is the more problematic and occurred in ten of the twelve 
reports (Table 6), in some reports with regard to one claim, in others with regard to all of the 
claims made. 5 Also, in five of the reports, claims were made which were not related to the 
original question. For example, one report concluded that “intraspecific and interspecific 
competition affects the growth, density, and distribution of plants,” drawing this causal 
conclusion from a dataset lacking measures attributable to “competition” (a quite abstract 
ecological concept) or to “growth.” However, in only two of these cases was the original 
question also left unaddressed (in two other cases no claims at all were made related to a question 
posed in the study). No statistics were calculated in any of the cases nor was mention made that 
they could be calculated for the correlational data. Overall, problems with the claims sections 
arose more frequently from claims which did not extend from the collected data, a quite 
frequent problem, than from claims which did not address the original question. 

5 Errors were not “double counted” in our analysis. For instance, if “slope” was operationalized inappropriately for 
the scientific meaning of the term and indicated as such in column 2 of Table 6, the interpretation of slope data in 
the claims section was viewed as being “inconsistent” or “consistent” in relation to the data which were collected, not 
in relation to a correct operationalization of “slope.” 

Detailed Analysis of Three Cases 

To gain further insight into the competencies of the preservice secondary teachers holding 
science degrees in standard scientific investigations we conducted a micro-analysis of three of 
the reports. In this analysis we used the analytic frame detailed above to examine the structure 
of the questions, the recording and reporting strategies, and the final claims for internal 
consistency and for the methodological approaches used. Table 6 is ordered such that reports 
with the most canonical approaches to research and reporting are found at the top and those with 
the fewest at the bottom. For this micro-analysis we chose one report from near the top, one 
report from the middle, and one report from the bottom of Table 6. These reports ranged from 
one that asked correlative questions, operationalized variables effectively, represented the data 
effectively, and drew claims related to the original data to a report which addressed causal 
questions, inappropriately operationalized variables, used inappropriate representations (of the 
collected data) and drew inappropriate conclusions (from the collected data). 

Case #1 (Table 6; Questions 2.a. & 2.b.) 

This report addressed the focus questions, “Is there a relationship between the maximum 
height of the horsetails and the density of the horsetails?” and “Is there a relationship between 
the maximum height of the horsetails and soil moisture?” To answer these questions, students 
staked out a 4 m by 5 m area with string in 1-metre square sections and drew a detailed map of 
its plant biota. Moisture was determined by repeated measures in each 1-metre quadrant to the 
“depth of the horsetails’ tap root.” Horsetails were “counted in each quadrant” and the height of 
the “horsetails in each quadrant was determined and recorded.” (Details of data presentation and 
interpretations are found in Figure 8.) 



[Insert Figure 8 about here] 



In our reading of the report we noted that both questions addressed correlations and were 
answerable in the physical and temporal context within which the students worked. The 
operationalization of the variables was consistent with the original questions (i.e., the questions 
dealt with horsetail height and density and soil moisture, and these were the data collected and 
tabulated). Data were inspected and one set of plants excluded from analysis as it appeared to the 
participants that they were misclassified and would thus confound or mislead interpretation. 

This report used two Cartesian graphs (Figure 8b & 8c) which allowed answering the focus 
questions. What was lacking in these graphical inscriptions, in comparison to the work of practicing 
scientists we observed during and after this fieldwork, were lines of best fit and statistical 
evaluations of the relationships. Conducting linear regression analysis would have revealed that 
the linear relationship between height and moisture had a regression with r = .75, p < .008 and the 
linear relationship between height and density had a regression with r = .65, p = .023. This 
analysis was not done (which is not unreasonable since statistical analysis was not requested in 
the assignment); however, if it had been, it would have strengthened the final claims (Table 7) 
made in the report. The first claim pertains to the original question asked, yet the speculation as 
to the correlation between height and density is not the only one possible. The second 
claim is a reasonable inference because high transpiration from some plants can lower local 
moisture levels. 6 Though the report does not discuss these explicitly, some of the “additional 
questions” (e.g., “how soil type affects soil moisture”) allow the inference that the report’s 
authors were considering these issues. Overall, this report showed a strong internal coherency, 
such as is found in formal scientific documents, from the original questions which were framed 
to the claims which were drawn from the data. 
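
The kind of statistical evaluation referred to above can be sketched as follows. This is a minimal illustration, not the analysis actually run on the students' data: the function names `pearson_r` and `t_statistic` and all quadrant values are assumptions invented for the example. The sketch computes the correlation coefficient r for paired measurements and the t value used to judge whether r differs from zero; the resulting t is compared against a t distribution with n - 2 degrees of freedom to obtain p.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired measurements."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t value for testing r against zero (n - 2 degrees of freedom).

    Undefined when r is exactly +1 or -1.
    """
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical quadrant data: soil moisture reading and maximum horsetail height (cm).
moisture = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
max_height = [31, 28, 36, 40, 38, 47, 44, 52]

r = pearson_r(moisture, max_height)
t = t_statistic(r, len(moisture))
print(f"r = {r:.2f}, t = {t:.2f}, df = {len(moisture) - 2}")
```

Reporting r together with the number of paired observations lets a reader check the stated significance level against a standard t table.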

6 Another reasonable interpretation from a biological perspective would have been that favorable conditions allow 
both increased distribution/height and density which therefore covary without having to be the cause of one another. 

Case #2 (Table 6; Questions 7.a. & 7.b.) 

This report addressed the focus questions: “How is moisture related to the slope of the hill?” 
and, “Is the clustering of the ferns related to the moisture in the soil?” To address them the 
students marked out a 4 m by 5 m plot and “drew a scale map of the plot including plant types 
and location” and then “divided plot into 20- 1m² sections.” They then “took moisture readings 
in every corner and the center of each section, at 4 cm depths.” “Slope” was operationalized by 
measuring distance down the slope from the highest elevation of the marked out plot. (Details of 
data presentation and interpretations are found in Figure 9. Letters A - E represent cross-slope 
co-ordinates, numbers 1 - 4 represent down slope co-ordinates.) 



[Insert Figure 9 about here] 



The questions in this report are structured to examine the relationships between two variables, 
but the variables are ineffectively operationalized and the data sets are not juxtaposed so that 
these relations can be determined. “Moisture” was operationalized as ‘average soil moisture,’ 
slope as ‘distance down slope,’ and fern counts or density were not determined or represented in 
ways which allowed comparison to the independent variable of moisture. Observations such as the 
locations of plant species were recorded in a scale map (represented in part by Figure 9, Data a; 
no key was provided) which could have been used for comparison with the measurements of average soil 
moisture in each of the 20 quadrants (Figure 9, Data b). 7 One characteristic of traditional science 
practices found in this study was the replication of measures of moisture reported in Data b. 

Data were then transformed into the representations seen in Figure 9: Transformed Data ‘a’ & 
‘b.’ 8 The Transformed Data ‘a’ bar graph represents the average moisture in the five sample 
strips (each with four 1-m² plots) across the slope. This bar graph is neither discussed in the text of 
the report nor does it address any of the focus questions; thus there seems little 
theoretical reason for its inclusion. The figure shown as Transformed Data ‘b’ represents the 
average moisture “down” the slope of the plot. We asked two ecologists to examine the graphs 
and they noted that because of the “relationship” that was initially framed as a question they 
would have chosen a Cartesian x-y graph to represent the data. 

7 The data in Figure 9, Data c, being unlabelled and unreferenced in the text, were at first not 
interpretable. After some work, we realized that Data c represented the average of the moisture readings obtained in 
each quadrant, which were given as raw data in the table shown (in partial representation) in Data b. 

8 Neither of these representations is explicitly related to the “slope of the hill” focus question (bar graphs are not 
used to illustrate correlations, although such a relationship may be read into Transformed Data b), and neither is 
related to the second focus question. 

To address the second question, our ecologists said they would have plotted the values of all 
measurements of moisture, or at least the average from each of the 20 plots, in a scatterplot with 
fern counts in each quadrant. However, such a graph may not have been possible (from the data 
included in the report) because there was no measure of fern density at the field site other than 
that in the site map that was drawn. If this map were “to scale” (as the report indicates) and the 
“F” markings indicated individual fern plants (this was not explicated), then such a graph would 
have been possible to construct. Finally, the bar graphs, which depicted means of means, did not 
include error bars to indicate the standard error, although these were calculable from the data 
available. 
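
To illustrate how such error bars could be obtained, the following is a minimal sketch in Python. The readings are hypothetical values chosen for illustration (they are not taken from the report), and `standard_error` is a helper name introduced here.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("need at least two readings")
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical replicate moisture readings for one strip (illustrative only).
readings = [2.0, 4.0, 4.0, 6.0]

mean = statistics.mean(readings)
sem = standard_error(readings)

# An error bar would then extend from (mean - sem) to (mean + sem).
error_bar = (mean - sem, mean + sem)
```

With replicated readings available for each plot, the same calculation could be repeated per bar in the graph.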

As indicated in Figure 9, there was little labeling of the figures and no captions were provided 
(which complicated our interpretive analysis of the inscriptions). 

The report’s first claim (Table 7) was stated as a “relation” (top, middle, bottom) as opposed 
to the correlational framing of the original question. A statistical analysis of the correlation from 
Data c, suggested to be a normative approach by our field ethnography work with ecologists, 
shows a correlation coefficient of r = .66, p = .0014 (an F test across the four distance 
categories would yield F = 7.01, p < .004). Hence, the report understates the conclusion which 
can be drawn from the data which was collected. 
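
As a rough check on figures of this kind, the significance of a Pearson correlation can be assessed by converting r to a t statistic with n - 2 degrees of freedom. The sketch below assumes n = 20 (one averaged moisture value per quadrant, as in Data c) and takes the magnitude of the reported coefficient as .66; the function name is ours, and the critical value is the standard tabled value for 18 degrees of freedom.

```python
import math


def t_from_r(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Assumed inputs (see the paragraph above): |r| = .66, n = 20 quadrants.
t = t_from_r(0.66, 20)

# Two-tailed critical value of t for df = 18 at alpha = .05 (standard tables).
T_CRIT_05_DF18 = 2.101
significant = abs(t) > T_CRIT_05_DF18
```

Under these assumptions the t statistic is well above the critical value, which is consistent with a correlational claim being supportable from the data.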

The report’s second claim, that “pattern of fern placement on the slope is not related to 
moisture content of the soil,” is difficult to interpret because, other than the visual clues which 
can be taken from the scale map regarding plant distribution, there are no numerical data to 
substantiate the claim. The ecologists we asked to view this report suggested that it would be 
better to operationalize “fern placement” along the slope as a quantity, thereby allowing 
statistical estimates of Type I and Type II errors against which the claims could be situated. 
Finally, the text of the interpretations/claims was unelaborated, providing little guidance as to 
how the graphs and information in the report should be read. Our research on scientific journal 
articles (which most of these students would have encountered in their third and fourth year 
courses in science) found that graphical inscriptions were considerably elaborated with text both 
in the caption and in the “claims” and data sections of journal articles, together mutually 
constituting the claims and reducing ambiguity in the reading of the inscriptions. Elaboration of 
this sort does not occur in this report, making the reading of the report and interpretation of 
the inscriptions much more difficult. 

Case # 3 (Table 6; Questions 11.a. & 11.b.) 

This report addressed the focus questions: “Do the exhaust gases from the cars parking in Lot 
C directly effect concentration (measure of productivity) of field flowers in front of the lot?” and 
“Do the exhaust gases form the cars parking in Lot C directly effect height (measure of growth) 
of field flowers in front of the lot?” To address these questions the students “selected a level area 
of flowering plants in parking lot C” and measured out a “4x5 m area” which was subdivided 
“into m² sections.” Each quadrant “was examined for growth and productivity of flowering 
plants.” Productivity was operationalized by counting “each bud/bloom .... rather than the 
stem.” The “height of each flower was measured using a metre stick.” (Details of data 
presentation and interpretations are found in Figure 10.) 



[Insert Figure 10 about here] 

Our reading of the report found several issues that were problematic. Both focus questions 
seek to establish causal relationships between a physical variable (presence of exhaust gases) and 
the biological variables “concentration” (used as a measure of productivity) and height (as a 
measure of growth). As written, there are conceptual and definitional difficulties with the focus 
questions being addressed. First, presence of “exhaust gases” is an inference about the effect of 
the proximity of the parking lot (as stated). Such a connection between “exhaust gases” and 
proximity to the parking lot would typically be an inference drawn in a “claims section” if 
proximity to the parking lot were found to be significantly related to growth of the field flowers. 
Such a claim is also complicated because a busy road (alongside D quadrats) paralleled the 
parking lot (alongside A quadrats) on the opposite side of the study area. Thus, interpretation of 
the effects of the independent variable of “exhaust gases” is confounded because it is present on 
both sides of the research area. 

There were further problems in the students’ conceptualization of variables. First, in 
biological research growth is not generally operationalized by examining height. More 
usually, growth means either changes in height over a period of time or the population density of 
a species of plant. As well, productivity is normally interpreted as a unit output per unit time and 
not “concentration” (more appropriately referred to as density) as was measured in this study. In 
addition, the “unit output” would normally refer to the number of plants, not the numbers of 
“bud/bloom” as were counted. What the report writers actually seek, and what their data allow 
them to make claims about, are relationships between distance to the parking lot and the 
biological variables of height and plant density. 

In the report data was inscribed in three ways: in a map, two tables, and two line graphs (with 
strip averages joined). The map depiction (e.g., Figure 10, Data a), isomorphic in its 
informational value, contained data points corresponding to the actual count of the number of 
lupines found. This data is then reproduced in a table which is rotated ninety degrees (not shown) 
and has “averages” calculated for the number of lupines in 1-m wide transects (parallel to the 
parking lot). Calculating these is of little utility given how the data is utilized in the 
graph (structured similarly to Figure 10, Data c), and their calculation may even have contributed 
to the points of average being joined rather than a trend line being drawn. The height data are 
similarly entered in a table (Figure 10, Data b) with a calculation of average height across the 
A to D bands. However, this average distorts the data because it is an average of the average 
heights in each quadrat, and therefore quadrats with low numbers of plants are overemphasized in 
averages for each band. The drawing of the line joining the averages further distorts any 
relationship because of the inclusion of the zero-value quadrats (A-1 & C-1), which bias the 
average value downwards. In analyses such as these it would not be unusual for these zero-height 
averages to be considered outliers and be excluded from an analysis of average heights. 
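
The distortion described above can be made concrete with a small sketch. The heights below are hypothetical values chosen for illustration (they are not the students’ data), and the function names are ours. The sketch contrasts an unweighted average of quadrat averages, with and without empty quadrats counted as zero, against a mean pooled over individual plants.

```python
import statistics

# Hypothetical plant heights (cm) per quadrat in one band (illustrative only).
band = {
    "A-1": [],                        # no plants found
    "A-2": [30.0],                    # a single plant
    "A-3": [50.0, 50.0, 50.0, 50.0],  # four plants
}


def mean_of_quadrat_means(quadrats, zero_for_empty=True):
    """Unweighted mean of per-quadrat means; empty quadrats count as 0 if requested."""
    means = []
    for heights in quadrats.values():
        if heights:
            means.append(statistics.mean(heights))
        elif zero_for_empty:
            means.append(0.0)
    return statistics.mean(means)


def pooled_mean(quadrats):
    """Mean over all individual plants, so each plant carries equal weight."""
    all_heights = [h for heights in quadrats.values() for h in heights]
    return statistics.mean(all_heights)


with_zeros = mean_of_quadrat_means(band)            # (0 + 30 + 50) / 3
without_zeros = mean_of_quadrat_means(band, False)  # (30 + 50) / 2
pooled = pooled_mean(band)                          # (30 + 4 * 50) / 5
```

In this illustration the single plant in A-2 counts as much as the four plants in A-3 in the average of averages, and the empty quadrat pulls the band average down further still.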
The participants transformed the data in a categorical fashion, using a line graph with the 
average height in each band connected (a plot of average counts and heights per “quadrant”) 
rather than a trend line in a scatterplot, as the data would have allowed. If distance from the 
parking lot was the independent variable, as the study suggests, then the appropriate Cartesian 
graph would have included it as a variable. There was also a lack of variable names on the 
abscissa and brevity in titling/labeling which meant that, as with Case # 2, the representations 
were not embedded in a thick descriptive context which would help scaffold the reader into 
interpreting the data in the manner desired by the report writers. 
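
A trend line of the kind referred to above is usually obtained by ordinary least squares, with distance from the parking lot on the abscissa. The following is a minimal sketch; the (distance, height) pairs are hypothetical and purely illustrative, and `least_squares_line` is a name introduced here.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical pairs: distance from the lot (m) and plant height (cm).
distance = [1, 2, 3, 4, 5]
height = [22.0, 27.0, 29.0, 36.0, 41.0]

slope, intercept = least_squares_line(distance, height)
```

Plotting the individual pairs together with this line, rather than joining band averages, keeps the scatter in the measurements visible to the reader.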

In the report, other than calculating the average height across each of the five plots, there 
were no statistics calculated. Both ANOVA and correlational analyses 9 might have offered better 
insights into what patterns were apparent in the data, but neither was conducted. In addition, in 
the calculation of averages in the report no consideration was given to what our ecologists might 
conclude were data outliers: in both quadrants A and C, there are “zero” values for growth 
(based on no lupines) which may have warranted exclusion as “outliers” and which influence the 
interpretation. 

Claims are based on data which could have established a relationship between the distance 
from the parking lot and the height and number of plants. However, the interpretation that 
“smog” has this relation is not supported by the data given that “smog” is present on both sides 
of the study site because of the presence of a busy roadway opposite the parking lot. Therefore, 
the claim (Table 7) that “smog decreases . . . growth...” is unsubstantiated given the data that was 
collected. Claim 3 acknowledges that the presence of a busy street on the other side of the 
research site might have had a mediational effect on the productivity (i.e., “total # of 
buds/blooms”) but left undiscussed why this would affect the number of buds/blooms but not the 
number of lupines (i.e., “growth”) present, which were at their highest number next to the 
roadway. Claim 4 further confuses interpretation by the reader because what seemed to be the 
independent variable under study (i.e., “smog”) is then implied to be something that the 
investigators wanted to control (as “control” was discussed in the class and mentioned in 
reports). 

9 In this instance we are unable to offer post-hoc analyses because only summarized data (averages), based on 
uneven cell size, were given in the report. 



Discussion of the Authentic Investigation Task 

Much like the middle school students in the initial phases of previous studies (Scardamalia & 
Bereiter, 1992; Roth & Bowen, 1993), and despite their previous (for most) science degree(s), 
these preservice teachers had difficulty constructing productive questions to direct their inquiries. 
The practices of our preservice teachers were also surprising in the light of the fact that 
practicing scientists ask “do-able” questions, that is, questions that can be answered within the 
set of contingent constraints under which they have to work. Our present research suggests that 
their university courses have not assisted them in the development of a sense for distinguishing 
do-able from not do-able research questions (cf. Bowen & Roth, 1998b). 

The majority of the focus questions investigated in these reports were causal and not 
correlational (Table 6). In the context of the activity in which they were engaged, 
investigating causal questions was generally not feasible: such questions are more commonly 
addressed either in experimental situations or over a considerable period of time, neither of 
which was possible in this activity. In our past work with Grade 8 students conducting similar 
(although long-term) outdoor field research activities we also noticed (unreported data) that 
their initial investigative questions were often causal. Further, in 17 of the 25 focus questions 
there was some difficulty with the operationalization of one or both of the variables, which 
would ultimately cause difficulties with claims made in the reports. 

These problems are similar to those found in the initial field work projects conducted by 
Grade 8 students. For instance, the focus questions addressed by the preservice secondary 
teachers had many similarities to those framed by the Grade 8 students when they first started 
their outdoor research: many were so conceptually “broad” that it would be difficult to address 
them in a single outdoor session. These questions addressed issues that were quite ecologically 
complex, such as “competition,” “biodiversity,” “growth,” and “productivity,” all of 
which have specific meanings in biology that do not equate to “distribution,” counts of limited 
numbers of organisms, or “height” as they were used by the pre-service secondary teachers. 10 
This meant that even some of the questions which were stated as a correlation (e.g., Q4.b.) were 
conceptually causal questions because of the concepts involved in the question and how 
they would need to be operationalized to be addressed (e.g., competition and plant distribution; 
Q4.b.). 

In science it is common practice to record data in tables, for organizational, process, and 
presentation reasons, and then to transform the data into more abstract representations allowing 
for the examination of relationships between variables (e.g., Figure 1). Since these studies were 
to be an examination of measured relations and since the vee-map heuristic prompted for the use 
of graphs, it was not unexpected that tables and then higher order transformations would be 
frequently used (14 and 15 respectively of 23), a slightly higher use of graphs than was found 
for the Lost Field Notebook problem. However, as discussed earlier, there were structural 
problems with both the graphs and tables resulting in difficulties in interpreting them. This 
would then compound difficulties in interpretation given that in the Lost Field Notebook and 
Plant Distribution activities described earlier we documented pre-service teachers having 
interpretive difficulty even when contextual cues in the form of suitable labels and titles were 
provided. Also, although only 44% of the respondents drew graphs in the Lost Field Notebook 
study, those which were drawn were scatterplots, which allowed interpreters to examine 
patterns of relation between the variables. However, in the reports of the outdoor research project 
10 of the graphs which were drawn (to address 8 of the focus questions) were inappropriate for 
the data and question being addressed. Transforming data into a graph from a structured table is 
a normal step in the conduct of science and might partly explain why there were discrepancies in 
graph use in the projects compared to the interpretations of the Lost Field Notebook problem. In 
the Lost Field Notebook problem numbers were clearly in pairs which, given that students learn 
about scatterplots being used for “pairs of data” from Grade 8 onwards, lent themselves to being 
depicted in a scatterplot. However, in the cases where graphs in the reports were used 
inappropriately, data was drawn from maps where “pairs of data,” such as in the Lost Field 
Notebook map, were much less obvious. How one structures the data in representing it (that is, 
how the data tables are structured) appears to influence the graphical inscriptions which result. 

10 Investigations of causality in ecology often involve experimental designs (as opposed to just observational ones) 
and occur over considerable periods of time sufficient to address factors such as plant growth, long term growth, 
competition, etc. 

Similar to their work in the Lost Field Notebook activity, the pre-service secondary teachers 
rarely identified outliers and excluded them from analysis which, as shown in Case Study #3, 
could affect interpretation. Furthermore, lines of best fit were used in only two graphs (in one 
case seemingly opposite to the pattern, in the other drawn through averages and not raw data), 
a frequency even lower than the 26% reported for interpretations of the Lost Field Notebook 
data. 

However, just as we argued that tables do not play the role for these students that they do for 
experienced researchers, we conclude that transformed inscriptions also play a different role. In 
part, this claim arises because of the discontinuity that exists between the text of the claims and 
the inscriptions. Whereas for experienced researchers the claims and the inscriptions are 
mutually constitutive, in the majority of these reports claims did not derive from the inscriptions. 
When examining transcripts of interpretations of the Plant Distribution graph it was clear that 
scientists used the graphs as a point around which to discuss claims (such as is found in its 
caption) relating both to their experiences in the field. In the pre-service teacher reports on the 
Authentic Investigation task the claims only rarely related to the inscribed data and often did not 
relate to the original question either. This is unlike the coherence found in scientific reports 
which are written to show questions clearly leading to data which clearly lead to inscriptions 
which clearly link to the claims which address the original questions. Such a continuity was not 
present in many of the reports submitted by the pre-service secondary teachers. 

If these reports were based on field studies that were expected, by the instructor, to be 
in-depth, lengthy, and conceptually complex then the number of non-standard approaches and 
interpretations present in the pre-service teachers’ reports might well be understandable. 
However, the assignment was little different from that done previously with Grade 8 students 
who were learning to conduct their own research projects as part of their regular science classes. 
In spite of the substantially greater education of the pre-service secondary science teachers, they 
exhibited no greater competency at structuring, conducting, and writing about this type of 
research activity than the Grade 8 students initially did. Similar difficulties, such as asking causal 
questions, inappropriately operationalizing variables, inappropriately using graphs, and 
constructing claims that extended beyond the data, initially were not uncommon amongst the 
Grade 8 students but became almost negligible as they gained more experience in conducting and 
presenting their own research. The contrast in their competency at engaging in such tasks 
compared to the student teachers should have considerable implications for teacher education 
programs. 



DISCUSSION 

In this study we had participants engage in a number of different tasks which, together, were 
quite similar to the panoply of practices in which scientists engage as they conduct their 
everyday work. The tasks progressed from analyzing data (such as students would do in a 
biology course in which they were learning about the practices of science or scientists would do 
when reading scientific reports), to interpreting data which had been analyzed and transformed 
by others (such as students and scientists do when they read research papers written by others) to 
a project in which participants conducted, analyzed, interpreted and drew conclusions from 
research of their own design (such as scientists do in their everyday work). In these activities 
there were both similarities and differences between the practices of working scientists and those 
of both pre-service teachers and science instructors. However, in general, the teachers and 
instructors did not often engage in the same practices nor reach the same conclusions as those 
who were experienced in conducting, summarizing, and analyzing research (including both 
working researchers and the Grade 8 students from our past work). It would seem that engaging 
in research projects of one’s own design (with all of the components of analyzing and drawing 
conclusions from this work) is an important component of learning to interpret the work (i.e., 
writings) of scientists in the ways in which they intend, and this was something which had not 
been done by the instructors or pre-service teachers. We now discuss some of those similarities 
and differences and the underlying concepts of significance. 

From World to Sign (Text) and Back 

According to Latour (1993), nature and its representations can be thought of as lying on an 
open continuum which, on one side, is characterized by increasing levels of locality, particularity, 
materiality, multiplicity, and continuity and, to the right, is characterized by increasing levels of 
compatibility, standardization, texts, calculation, and relative universality. What scientists (and 
others doing research) accomplish are transformations of ontologically different representations 
linked only by consensus on the process and products of transformations between different 
inscriptions. Our tasks can be mapped onto this continuum to show the different nature of each 
task (Figure 1). In the Lost Field Notebook, the task requires participants to transform the data 
into an inscription to the right, and then to reconstruct a natural setting in which the data might 
have been collected. In the Distribution Graph, the task was to read the graph as a story that had 
a referent in the world not only about the distribution of plants, but about the adaptation of plants 
to particular environments. Finally, in the Authentic Inquiry tasks, participants were asked to go 
full circle: rather than reconstructing environments from texts (graph/caption), they actually 
knew the setting about which they were to make some general statement. Thus, despite the fact 
that the task involved more steps, it might have been easier given that they were making a 
statement about an ecozone with which they had personal experience. 

Although the (preservice) elementary teachers did not translate their data in the Lost Field 
Notebook task, and therefore had little to say about the overall relations between the two 
variables, they did reconstruct possible scenarios that could have led to the particular data they 
had in front of them. On the other hand, the Distribution Graph led one group of preservice 
elementary science teachers and one instructor to make explicit links to their personal 
experiences related to changing fauna with changing climes and elevation; most activity 
remained referentially isolated in the context set by the sign structures (words, data, lines). 

Scientists had a singular set of practices for dealing with the Lost Field Notebook: they plotted 
the data, proposed regression analysis to test goodness of fit, discussed an outlying data point, 
and suggested the collection of additional data to increase the power of the statistical analysis. 
Our past research with pre-service secondary teachers reported that they infrequently used graphs 
to address the Lost Field Notebook problem, and in this study we found that “priming” them about 
the importance of using graphs to make correlative arguments resulted in an increase in graphs 
being used when addressing the Lost Field Notebook problem. However, even with this priming 
it was still only a minority of students that used scatterplots. We turned to the Authentic 
Investigation task to obtain further insight into why the priming was ineffective. From these 
reports we realized that the difficulties lay not in knowing whether a graph should be used, but 
rather in not knowing how to structure data and choose appropriate inscriptions to 
address problems. 



Epistemology 

Instructors and pre-service teachers acted as if a relationship had to be unambiguous, with all 
data points consistently “in line” with each other. Here, the belief in a mathematical nature of the 
universe is inherent in the explanations; there did not seem to be another way. Thus, variation in 
one of the two variables with a constant second variable, or a comparison of a negative 
relationship between two data pairs, was sufficient to reject a positive relationship between the 
two variables. 
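
Why such pairwise comparisons are an unreliable basis for rejecting a relationship can be shown with a short sketch. The data are hypothetical (light in foot-candles, percent coverage), loosely modeled on the kind of values in the Lost Field Notebook task but not taken from it; `pearson_r` is our own helper. The sketch computes the overall correlation and then counts the pairs of points that, taken alone, run against the overall trend.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measurements with random variation (illustrative only).
light = [500, 750, 1000, 1250, 1500, 1750]
cover = [30, 20, 40, 45, 30, 60]

r = pearson_r(light, cover)

# All pairs of data points, and those whose two-point "slope" is negative.
pairs = list(combinations(zip(light, cover), 2))
against = [(p, q) for p, q in pairs if (q[0] - p[0]) * (q[1] - p[1]) < 0]
```

In this illustration the overall correlation is clearly positive even though several individual pairs suggest a negative relationship and one pair shows the same coverage at two different light levels, which is what one would expect of measurements with random variation.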

Where might this default practice come from? Given students’ experience with science from 
science textbooks and lectures, and their mathematics experiences (where they likely would 
have been plotting functions), they would have seen predominantly, if not exclusively, line 
graphs and “data points” that fell, in an ideal way, on the line. In these sources there is a didactic 
use of clean line graphs; and in the few cases that “data” were plotted, these fell exactly on the 
best-fit line on the graph 11 (Roth, McGinn, & Bowen, 1997). 

It has been noted that scientists believe in the isomorphism of nature and mathematics 
(Lynch, 1991); in many cases, and for a historically long period, scientists believed that the 
world is inherently mathematical such that mathematical structures not only describe but in fact 
are responsible for the patterns in the world. Our research shows that not only scientists appear to 
operate as if nature was inherently mathematical. Furthermore, the very practices of using 
graphical representations and the mathematics activities in which functions are plotted may be at 
the origin of such default, commonsense and mundane assumptions about the world. 

Significance for Educating Science Teachers 

Overall, although the preservice secondary teachers had undergraduate and even graduate 
degrees in science, they did not default to practices that scientists use routinely in their everyday 
work. This has implications for undergraduate science education and science teacher education. 
As we found in the previous study where we examined the responses of teachers to the Lost 
Field Notebook problem, the results of this study suggest that most preservice teachers do not 
seem to be ready to teach scientific practices of interpretation in the way advocated by 
curriculum reform. Of even more concern was their difficulty in conducting and summarizing an 
open-ended research project of their own. In scientific communities participants ask do-able 
questions and use graphing on a day-to-day basis as default approaches to participating in the 
domain (Fujimura, 1987). As many of our participants had science degrees, we might expect 
them to default to these practices. This was not the case, and is even less likely to occur when 
there is less scientific training as part of a teacher education program, as occurs in many U.S. 
universities and colleges. If teachers have difficulty asking “do-able” questions themselves, how 
are they to scaffold students towards asking them so that they can effectively engage in activities 
which have a high degree of similarity to scientific inquiry? We suggest that not asking such 
questions oneself in the context of “authentic” field investigations indicates that there will not be 
the requisite recognition of appropriate or inappropriate questions asked by students that would 
be necessary to help students develop such skills. Simply telling preservice teachers which 
questions are appropriate or inappropriate outside of the context of their engagement in lab 
investigations will not increase their competence in helping students ask appropriate questions. 
That preservice teachers do not engage in these practices is not a critique of them individually, 
but rather a commentary on the efficacy of the experiences they have engaged in during their 
undergraduate studies. 

11 Such a description also characterizes relationships between variables as they appear in newspapers and news 
magazines. 

Our research has considerable implications for the preparation of science teachers. At 
present, our preservice teachers do not seem to be ready to competently teach inquiry and data 
analysis in the way suggested by recent reform documents (AAAS, 1993; NCTM, 1989). 
Representing is a central part of science (Latour, 1987) and being able to scaffold students into 
the appropriate use of graphs and tables in the context of addressing questions which are do-able 
is something that teachers need to be able to do to address the curriculum reform documents. 
Thus, despite the considerable amount of preparatory course work that these preservice teachers 
had taken in science, they were insufficiently prepared to teach in the way we would like them 
to. As with the telling of student teachers what “appropriate” questions are for investigation, we 
also do not think that simply telling preservice teachers which graphs or other tools of 
interpretation are appropriate will increase their competence in helping students learn canonical 
methods of data analysis and interpretation. We have argued elsewhere (Roth & Bowen, in press; 
Roth & McGinn, 1998) for changes in teaching science that would focus on graphing as social 
and cultural practices in which student teachers should become more engaged as part of their 
undergraduate science work. This should address what is clear from our work with preservice 
teachers and in undergraduate science classrooms: that they have little practical experience 
engaging in the mathematical practices of science. Structural change is needed in the 
undergraduate experiences of preservice teachers if they are to fulfill the goals of the reform 
documents and have their own students engage in the daily scientific practices of asking do-able 
questions and making claims based on appropriate use of various inscriptions and 
representations. In our social practice framework, preservice teachers need to have more 
experience in using graphing to help construct rhetorical claims around investigations they have 
designed. This would seem to be the most effective way for them to become enculturated into the 
practices of science which they can then use as a foundation to enculturate their own students. 
However, as members of a community involved in preparing teachers to go into schools and 
teach children, we have to question (a) whether the objectives in our reform documents 
are realistic given the current teaching practices in colleges and universities and (b) what kind of 
science experiences would prepare preservice teachers with undergraduate degrees in science in 
a better way for meeting the challenges posed by the visions of the reform documents. 

CAPTIONS 

Figure 1. Relations of inscriptions between world and sign 

Figure 2. (a) Lost Field Notebook task, (b) Scatterplot of LFN data 

Figure 3. Plant Distributions graph and caption 

Figure 4. Non-linear scatterplot drawn by “Steve” for LFN task. “Steve” had the axes reversed 
compared to all others who used a scatterplot to address the task. 

Figure 5. Solution to LFN task by (pre-service) secondary teacher. 

Figure 6. Solution to LFN task by (pre-service) secondary teacher who dealt with the data in two 
sets: (a) scatterplot of four locales for which a correlative relationship was claimed, (b) 
scatterplot of four locales for which a claim was made of no relationship. 

Figure 7. Semiotic model of reading graphs. The upper left hand side represents the process of 
perceptually individuating some element that has the potential of becoming a sign 
object. On the lower right hand side, signs are read as being about natural objects. 
Conventional constraints r on sign use, and contextual constituents c of individual sign 
elements mediate the reading of the graph. 

Figure 8. Scans of data & transformations from Case #1’s report. 

Figure 9. Scans of data & transformations from Case #2’s report. 

Figure 10. Scans of data & transformations from Case #3’s report. 

Table 1. Participants and task distribution. 

Table 2. Strategies used and comparisons made by instructors. 

Table 3. Distribution of data transformations and types of claims by preservice secondary 
teachers. 

Table 4. Numerical strategies and comparisons made by preservice elementary teachers. 

Table 5. Comparative reasoning patterns and strategies deployed with individual data points. 

Table 6. Summary of the research reports for the field investigation task completed by 
preservice secondary science teachers. 

Table 7. Claims made in the reports of the three case studies. 

Figure 1. Relations of inscriptions between world and sign 

Distribution of C3, C4, and CAM (succulent plants) in the desert and semi-desert vegetation of Big Bend National 
Park, Texas, along a moisture and temperature gradient due to differences in elevation. CAM plants with nocturnal 
gas exchange for water conservation predominate in the hottest, driest environment, C4 plants are maximally 
important under intermediate temperature and moisture conditions, and C3 plants predominate at the cooler, least dry 
end of the gradient. (Modified data from Eickmeier, 1978) 

Figure 3. Plant Distributions graph and caption 







Figure 4. Non-linear scatterplot drawn by “Steve” for LFN task. “Steve” had axes reversed 
compared to all others who used a scatterplot to address the task. 

The Lost Field Notebook 

1. Patterns seen: 

— “Tendency” for increase in foot candles -> increase in % coverage but not absolutely shown by figure above. 
— One major inconsistency: 30% coverage @ 500 f.c. but also 30% coverage @ 1500 fc. 

— Outer areas have greater % coverage, generally 

2. Claims: 

— Suggest different soil temperature, terrain types 
— Suggest different water supply 

— Shows plant is able to grow in lower lighting conditions 

3. Support: 

— From graph of data 
— Must be factors other than light 

Figure 5. Solution to LFN task by (pre-service) secondary teacher. 

Figure 6. Solution to LFN task by (pre-service) secondary teacher who dealt with the data in two 
sets: (a) scatterplot of four locales for which a correlative relationship was claimed, (b) 
scatterplot of four locales for which a claim was made of no relationship. 

Figure 7. Semiotic model of reading graphs. The upper left hand side represents the process of perceptually 
individuating some element that has the potential of becoming a sign object. On the lower right hand side, signs are 
read as being about natural objects. Conventional constraints r on sign use, and contextual constituents c of 
individual sign elements mediate the reading of the graph. 

Figure 8. Scans of data & transformations from Case #1’s report. 

Figure 9. Scans of data & transformations from Case #2’s report. 

Figure 10. Scans of data & transformations from Case #3’s report. 

Table 1 

Participants and task distribution 

Population                                   Task 
                                             Lost Field Notebook        Plant Distribution Graph   Authentic Investigation 
Research Scientists (N = 15)                 Think aloud (10)           Think aloud (10) 
Science teachers with B.Sc. (N = 4)          Think aloud                Think aloud 
Preservice Elem. Science Teachers (N = 10)   Pairwise Protocol          Pairwise Protocol 
                                             (5 pairs)                  (5 pairs) 
Preservice Sec. Science Teachers (N = 25)    Written (individual)                                  Written (11 pairs, 
                                                                                                   1 triad) 

Table 2 

Strategies used and comparisons made by instructors 

              Strategies 
Instructor    Within    Between    Cross    Total 
Ike             4          3                  7 
Ira             1                    1        2 
Ian             1                    2        3 
Ina             3          1                  4 
Total           8          4         3       16 

Comparisons 
CD: 2 
CE: 1 
CF: 6 
DE: 1 
DH: 2 
DEC: 1 
DEH: 1 
HAD: 2 

Table 3 

Distribution of data transformations and types of claims by preservice secondary teachers (N = 27) 

Representation                   Relationship 
                                 Yes       No 
Plot + best fit (outlier)         6         1 
Plot only                         0         4 
                                (.5)      (.5) 
Table                             0         5 
Verbal                            0         8 
Other (cross section, ratio)      0         2 

Table 4 

Numerical strategies and comparisons made by preservice elementary teachers 

          Strategies 
Group     Within    Between    Cross    Qual (Within)    Total 
A           0          6         0           2             8 
B           1          2         1           3             7 
C           0          1         0           2             3 
D           0          0         0           5             5 
E           1          0         1           1             3 
Total       2          9         2          13            26 

Comparisons 
BH: 0(1) 
CD: 0(2) 
CE: 1(2) 
CF: 5(1) 
CH: 0(1) 
DE: 2(3) 
DG: 2(1) 
DH: 3(1) 
GH: 0(1) 

Table 5 

Comparative reasoning patterns and strategies deployed with individual data points 

Reasoning              Data Points Deployed          Strategy                       Total 
                                                     B       W (qual)     X 
high:high, low:low     D:G, D:E                      3        0           0           3 
                                                             (4)                     (4) 
different %:           D:H, C:F, D:E:F, H:A:D        9       13           2          24 
same/similar fc                                              (4)                     (4) 
same/similar %:        C:E, B:H, C:D, G:H, D:C:E     3        3           2           8 
different fc                                                 (7)                     (7) 
increase:decrease      D:E, C:H                      0        2           1           3 
decrease:increase                                            (0)                     (0) 
Total                                               15       18           5          38 
                                                            (15)                    (15) 

Table 6 

Representation of the research conducted by preservice secondary science teachers with science degrees 

1.a. Question: REL [host preference] [species] 
     Design (operationalization): [#bugs] || [species] 
     Data Representation: MAP[plants]; TAB([site#][species], [frequency]) 
     Transformation: SUM[#bugs], AVG[#bugs]; WHISKER([AVG], [species]) 
     Claim: REL([preference], [species]) 

1.b. Question: CORR [light intensity] [distribution] 
     Design (operationalization): [light intensity] || [location] 
     Data Representation: TAB([row#][column#], [light intensity]) 
     Transformation: MAP([intensity][isolines], [location]) 
     Claim: +CORR([light intensity], [distribution]) 

2.a. Question: CORR [height] [density] 
     Design (operationalization): [height] || [#species/m²] 
     Data Representation: MAP([rel. coverage], [location]); TAB([x-loc][y-loc], [moisture][height][density]) 
     Transformation: SCATTER([height], [density]) 
     Claim: +CORR([density], [height]) 

2.b. Question: CORR [height] [moisture] 
     Design (operationalization): [height] | [moisture] 
     Data Representation: " 
     Transformation: AVG[moisture]; SCATTER([height], [moisture]) 
     Claim: -CORR([moisture], [height]) 

3.a. Question: CORR [moisture] [slope] 
     Design (operationalization): [moisture] || [x-loc][y-loc] 
     Data Representation: MAP([moisture][plantloc][plant height], [x-loc][y-loc]); TRANS([species][location]); TAB([x-loc][y-loc], [moisture]) 
     Transformation: BAR([density], [x-loc]) 
     Claim: noCORR([slope], [moisture]); CAUS([competition], [growth][density][distribution]) 

3.b. Question: CAUS [moisture]-->[distribution]; [moisture]-->[height] 
     Design (operationalization): [moisture] || [# horsetails, x-loc]; [moisture] || [height] 
     Data Representation: " 
     Transformation: SCATTER([height], [moisture]) 
     Claim: REL([# horsetails] [disturbance]); -CORR([[disturbance]/[moisture]], [height]) 

4.a. Question: CORR [pH] [distribution] 
     Design (operationalization): [pH] | [rel. coverage] 
     Data Representation: MAP([rel. coverage], [location]); TAB([species], [% coverage]); TAB([location], [pH]) 
     Transformation: <none> 
     Claim: noCORR([pH] [distribution]) 

4.b. Question: CORR [competition] [distribution] 
     Design (operationalization): [interface] || [rel. coverage] 
     Data Representation: " 
     Transformation: <none> 
     Claim: competition([species1] [species2]); competition([species2] [species3]) 

5.a. Question: CAUS [pH]-->[vegetation] 
     Design (operationalization): ¥ [pH] | [species] 
     Data Representation: MAP[plants]; TAB([trial#][species], [pH]) 
     Transformation: AVG[pH]; BAR([AVG], [species]) 
     Claim: noREL([pH], [species]) 

5.b. Question: CAUS [moisture]-->[vegetation] 
     Design (operationalization): ¥ [moisture] || [species] 
     Data Representation: TAB([trial#][species], [moisture]) 
     Transformation: AVG[moisture]; BAR([AVG], [species]) 
     Claim: CAUS([moisture], [species]) 

6.a. Question: CAUS [moisture]-->[growth] 
     Design (operationalization): ¥ [moisture] || [species] 
     Data Representation: MAP([rel. coverage][moisture], [location]) 
     Transformation: <none> 
     Claim: +CORR([species1], [moisture]); +CORR([species1], [light]); +CORR([species1], [gradient]) 

6.b. Question: CAUS [overstory][mid-story]-->[growth] 
     Design (operationalization): ¥ [species] 
     Data Representation: " 
     Transformation: <none> 
     Claim: <none> 

7.a. Question: CORR [moisture] [slope] 
     Design (operationalization): [moisture] | [x-loc][y-loc] 
     Data Representation: TAB([x-loc][y-loc], [moisture]) 
     Transformation: AVG[moisture]; BAR([moisture], [x-loc]); BAR([moisture], [y-loc]) 
     Claim: -CORR([x-loc], [moisture]) 

7.b. Question: CORR [distribution] [moisture] 
     Design (operationalization): [plants] | [moisture] 
     Data Representation: MAP([plants], [x-loc][y-loc]) 
     Transformation: <none> 
     Claim: noCORR([species1], [moisture]) 

8.   Question: CORR [growth] [pH] [moisture] 
     Design (operationalization): [rel. coverage] || [pH], [moisture] 
     Data Representation: MAP[relative coverage]; TAB([rel. cover], [pH][moisture]) 
     Transformation: LINEG[pH],[rel. coverage]; LINEG[moisture],[rel. coverage] 
     Claim: noCORR[pH], [rel. coverage]; noCORR[moisture], [rel. coverage] 

9.a. Question: CAUS [slope]-->[moisture] 
     Design (operationalization): [gradient] || [x-loc] || [moisture] 
     Data Representation: <none> 
     Transformation: AVG[moisture]; SCATTER([x-loc], [AVG]); SCATTER([gradient], [Δmoisture]); BAR([x-loc], [% gradient]) 
     Claim: +CORR([gradient] [moisture]); REL([gradient], [x-loc]) 

9.b. Question: CAUS [slope]-->[coverage] 
     Design (operationalization): [x-loc] || [% species1] 
     Data Representation: MAP([rel. coverage], [location]) 
     Transformation: BAR([x-loc], [% species1]) 
     Claim: +CORR([gradient], [% cover]) 

10.a. Question: CAUS [pH]-->[vegetation type] 
     Design (operationalization): [pH] || [#species1][#species2] 
     Data Representation: MAP([pH][species1][species2], [x-loc][y-loc]) 
     Transformation: BAR([pH][#species1][#species2], [location]) 
     Claim: Inconclusive 

10.b. Question: CAUS [pH]-->[vegetation quantity] 
     Design (operationalization): " 
     Data Representation: " 
     Transformation: " 
     Claim: Inconclusive; REL([pH], [location]) 

11.a. Question: CAUS [pollution]-->[productivity] 
     Design (operationalization): [distance to parking lot] || [#buds/blooms species1] 
     Data Representation: MAP([species1], [x-loc][y-loc]); TAB([row#][column#], [#buds/blooms]) 
     Transformation: AVG[#buds/blooms]; LINEG([AVG], [x-loc]) 
     Claim: -CORR([smog] | [x-loc], [productivity]) 

11.b. Question: CAUS [pollution]-->[growth] 
     Design (operationalization): [distance to parking lot] || [height] 
     Data Representation: TAB([row#][column#], [height]) 
     Transformation: AVG[height]; LINEG([AVG], [x-loc]) 
     Claim: CORR([x-loc][growth]) 

12.a. Question: CAUS [feature]-->[biodiversity] 
     Design (operationalization): [aspect] || [rel frequency species] 
     Data Representation: MAP([plants][pH]) 
     Transformation: BAR([rel. frequency], [aspect]) 
     Claim: REL([aspect], [species]); -CORR([species1], [light]); +CORR([species1], [moisture]); +CORR([species2], [heat/light]); -CORR([species2], [moisture]) 

12.b. Question: CAUS [substrate]-->[biodiversity] 
     Design (operationalization): frequency species] 
     Transformation: <none> 
     Claim: <none> 

Key: CORR - correlational statement/claim made (i.e., one variable's measure covaries with another measure) 
REL - relational claim made (i.e., with categorical variables) 
CAUS - causal statement made (i.e., one variable causes another to change) 
TAB - data represented in a table 
MAP - data represented in a map/drawing 
TRANS - landscape viewed in a side profile 
BAR([y][x]) - Bar graph used 
SCATTER([y][x]) - Scatterplot graph used 
LINEG([y][x]) - Line graph used 
WHISKER([AVG][species]) - categorical graph plotting averages w/ range in each category used 
AVG[variable] - average given for a variable 
single underline - conceptual problem in operationalization of variable 
double underline - problem in implementation of operationalization of variable (e.g., replication, sampling) 
¥ - problem of relation of variable1 to variable2 
■ - transformation not related to original question 
. - inappropriately used graph (e.g., bar instead of scatterplot) 
4 - claim not related to original question 
* - claim is not conceptually related to the data which was collected and presented 



Table 7 

Claims made in the reports of the three case studies 

Claims in Reports 

Case Study 1 

1. We found that the area with the highest horsetail density had the horsetails 
with the tallest height. This could be that the areas with the highest 
density had the most favorable conditions such as nutrients, shade, and 
light, which allowed them to grow taller. 

2. We found that the areas with the least moisture content had the horsetails 
with the tallest height. This could be that the taller horsetails have 
absorbed more water (nutrients) thus reducing the moisture content of the 
soil. 

3. Additional questions that might require further investigation might 
include how competition with other plant species affects the height and 
density of horsetails; how soil type affects soil moisture; and why the 
density of horsetails decreased with proximity to the road. 

Case Study 2 

1. Moisture at the top of the slope on a sunny day is greater than in the 
middle or bottom. This is probably due to moisture (e.g., rainfall) hitting 
the soil at the top more often than elsewhere because of how various 
plants on the slope prevent moisture from accessing the soil, i.e., there are 
fewer plants at the top of the slope. Gradually, rainwater at the top would 
run down the hill because of gravity. 

2. Further tests could help determine the effects of plant type versus position 
on slope. We also might learn more by taking moisture readings in the soil 
during or after various degrees of rainfall. Presumably, different plant 
types utilize different amounts of moisture so we could test soil moisture 
around various types. 

3. Pattern of fern placement on the slope is not related to moisture content of 
the soil. 

4. Further tests could indicate whether fern placement pattern is due to 
competition from other plants, symbiotic relations with other plants, 
availability of sun versus shade, pH of soil, wind resulting in fertilization 
and distribution of spores, animal movement resulting in distribution of 
spores, animal and human traffic effecting survival of plants. 

Case Study 3 

1. It is possible that smog decreases both productivity and growth of lupines. 

2. Growth shows a more consistent correlation with distance from the 
parking lot (used as a measure of concentration) compared with the 
productivity. 

3. The fact that a street exists on the opposite side of the parking lot 
indicates why a decrease in productivity occurs in quadrants after B. 

4. One parameter that was not controlled in this investigation was the 
influence of smog. 